Python - Crawling or Scraping


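First, if this is your first time crawling in Python, install BeautifulSoup:

pip install bs4

On PyPI, bs4 is a thin wrapper package that pulls in beautifulsoup4, so pip install beautifulsoup4 installs the same library.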
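A quick way to confirm the install worked is to import the package and print its version. This is only a minimal sanity check; the variable name installed_version is made up for illustration.

```python
import bs4

# bs4 exposes its version string as a module attribute.
installed_version = bs4.__version__
print(f"BeautifulSoup version: {installed_version}")
```

If the import raises ModuleNotFoundError, the package was most likely installed into a different Python environment than the one running the script.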

https://www.pythonscraping.com/pages/warandpeace.html


This walkthrough uses the HTML of the page above.

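from bs4 import BeautifulSoup
from urllib.request import urlopen
html = urlopen('https://www.pythonscraping.com/pages/warandpeace.html')
bs = BeautifulSoup(html, 'html.parser')  # assumed setup: the later snippets expect the parsed page in `bs`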
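Network requests can fail, so it may be worth wrapping the download in a small helper. The sketch below is one possible way to do that, not the post's own method: the function name fetch_soup, the 10-second timeout, and the choice of the built-in html.parser are all assumptions made for illustration.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_soup(url):
    """Return a BeautifulSoup object for url, or None if the request fails."""
    try:
        with urlopen(url, timeout=10) as response:
            html = response.read()
    except (HTTPError, URLError) as err:
        # HTTPError: the server answered with an error status (404, 500, ...).
        # URLError: the resource could not be reached at all.
        print(f"Request failed: {err}")
        return None
    # html.parser ships with Python, so no extra parser install is needed.
    return BeautifulSoup(html, "html.parser")
```

Calling fetch_soup('https://www.pythonscraping.com/pages/warandpeace.html') would then give an object that can be used the same way as bs in the rest of this post, provided the request succeeds.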

When several tags match, find() returns only the first one.
Let's start by trying to fetch the red text.
print(bs.find('span'))
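To see the "first match only" behavior without touching the network, here is a small sketch that runs find() on an inline HTML string. SAMPLE_HTML is a made-up snippet that only imitates the span structure of the sample page; it is not the page's real markup.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML that imitates the sample page: three span tags in a row.
SAMPLE_HTML = """
<div id="text">
<span class="red">Well, Prince</span> said
<span class="green">Anna Pavlovna</span> to
<span class="green">Prince Vasili</span>.
</div>
"""

bs = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() stops at the first span, even though there are three.
first_span = bs.find("span")
print(first_span.get_text())  # Well, Prince

# When nothing matches, find() returns None instead of raising an error.
missing = bs.find("table")
print(missing)  # None
```

Because find() can return None, it is safer to check the result before calling get_text() on it.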

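To find every span tag, use find_all():
bs.find_all('span')

The output is wrapped in [ ], which means the result comes back as a list (a bs4 ResultSet, which is a subclass of list).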
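Since the result behaves like a list, the usual list operations apply. The sketch below demonstrates this on a made-up inline snippet (SAMPLE_HTML is hypothetical and only imitates the sample page).

```python
from bs4 import BeautifulSoup

# Hypothetical HTML that imitates the sample page: three span tags in a row.
SAMPLE_HTML = """
<div id="text">
<span class="red">Well, Prince</span> said
<span class="green">Anna Pavlovna</span> to
<span class="green">Prince Vasili</span>.
</div>
"""

bs = BeautifulSoup(SAMPLE_HTML, "html.parser")

spans = bs.find_all("span")

print(isinstance(spans, list))  # True
print(len(spans))               # 3
print(spans[0].get_text())      # Well, Prince
print(spans[-1].get_text())     # Prince Vasili
```

When nothing matches, find_all() returns an empty result rather than None, so looping over it is always safe.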

spanTagList = bs.find_all('span', class_='green')

for spanTag in spanTagList:
    print(spanTag.get_text())

The tags whose class is green are stored in a variable, and a for loop prints the text of each one.
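The same filter can also collect the text into a list instead of printing it. The sketch below does that on a made-up inline snippet (SAMPLE_HTML is hypothetical), and also shows the CSS selector form select('span.green') as an alternative way to write the same query.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML that imitates the sample page's red and green spans.
SAMPLE_HTML = """
<div id="text">
<span class="red">Well, Prince, so Genoa and Lucca are now just family estates.</span>
<span class="green">Anna Pavlovna</span> greeted
<span class="green">Prince Vasili</span>.
</div>
"""

bs = BeautifulSoup(SAMPLE_HTML, "html.parser")

# class_ has a trailing underscore because class is a reserved word in Python.
green_names = [tag.get_text() for tag in bs.find_all("span", class_="green")]
print(green_names)  # ['Anna Pavlovna', 'Prince Vasili']

# The same query written as a CSS selector.
selected = [tag.get_text() for tag in bs.select("span.green")]
print(selected == green_names)  # True
```

Keeping the text in a list makes it easy to deduplicate, sort, or save the results later.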

백만장자 개발자

파이썬(python) - 크롤링(Crawling) 또는 스크래핑(Scraping)

'✨ python > 크롤링(Crawling)' 카테고리의 다른 글

댓글

티스토리툴바

파이썬(python) - 크롤링(Crawling) 또는 스크래핑(Scraping)

'✨ python > 크롤링(Crawling)' 카테고리의 다른 글

관련글

댓글

티스토리툴바