BeautifulSoup: Handling Elements of a Specific Tag in an HTML Structure

2022. 6. 18. 12:03

Written by 수알치 오상문

Suppose we have the following HTML page and want to retrieve the value set in the datetime attribute of the time tag.

<html>
<head>
</head>
<body>
<time class="jlist_date_image" datetime="2025-04-02 14:30:12">Idag
</time>
</body>
</html>

If the BeautifulSoup package is not installed, the import raises an error, so install BeautifulSoup before running the example.

pip3 install beautifulsoup4

or

pip install beautifulsoup4

As of 2022-06-18, the current version is beautifulsoup4-4.11.1.
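To confirm that the installation worked and to see which version is actually in use, a quick check like the following may help. It is a minimal sketch that assumes the package was installed into the same Python environment that runs the script.

```python
import bs4

# bs4.__version__ holds the installed beautifulsoup4 version as a string.
print(bs4.__version__)
```

If the import itself fails with ModuleNotFoundError, the package is most likely missing from the active environment.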

[Source code] Example

# pip3 install beautifulsoup4
# or: pip install beautifulsoup4
# Version used for this example: beautifulsoup4-4.11.1
from bs4 import BeautifulSoup

html = '''<html>
<head>
</head>
<body>
    <time class="jlist_date_image" datetime="2025-04-02 14:30:12">Idag
       <span class="list_date">14:30</span>
</time>
</body>
</html>'''

soup = BeautifulSoup(html, features="html.parser")

# Method 1: loop over every <time> tag and print its datetime attribute if present
for tag in soup.find_all('time'):
    if tag.has_attr('datetime'):
        print(tag['datetime'])

# Method 2: read the attribute directly from the first <time> tag
val = soup.time.attrs['datetime']
print(val)

[Output]
2025-04-02 14:30:12
2025-04-02 14:30:12
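
As a possible extension of the two methods above, the sketch below filters on the attribute while searching, reads it with Tag.get() so that a missing attribute yields None rather than a KeyError, and converts the string into a Python datetime object. The helper extract_datetimes and its fmt parameter are hypothetical names introduced here for illustration, and the format string is an assumption based on the sample value.

```python
from datetime import datetime

from bs4 import BeautifulSoup

# Same sample markup as in the example above.
html = '''<html>
<head>
</head>
<body>
    <time class="jlist_date_image" datetime="2025-04-02 14:30:12">Idag
       <span class="list_date">14:30</span>
</time>
</body>
</html>'''


def extract_datetimes(markup, fmt="%Y-%m-%d %H:%M:%S"):
    """Return the datetime attribute of every <time> tag as datetime objects.

    Hypothetical helper; fmt is an assumption based on the sample value.
    """
    soup = BeautifulSoup(markup, features="html.parser")
    results = []
    # attrs={"datetime": True} matches only <time> tags that carry the attribute.
    for tag in soup.find_all("time", attrs={"datetime": True}):
        results.append(datetime.strptime(tag["datetime"], fmt))
    return results


soup = BeautifulSoup(html, features="html.parser")

# A CSS attribute selector is another way to locate the tag.
first = soup.select_one("time[datetime]")

# Tag.get() returns None when the attribute does not exist.
print(first.get("datetime"))
print(first.get("title"))
print(extract_datetimes(html))
```

Working with datetime objects instead of raw strings makes later steps such as sorting or comparing timestamps simpler, but the format string has to match the pages actually being scraped.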

License: Attribution, NonCommercial, NoDerivatives (CC BY-NC-ND)
