The Youtube comment scraper seems to be down. #49

Open
Hypurl opened this issue Dec 30, 2020 · 3 comments

@Hypurl

Hypurl commented Dec 30, 2020

Getting error:
`runfile('C:/Users/Jeff/Downloads/untitled1.py', wdir='C:/Users/Jeff/Downloads')
Traceback (most recent call last):
  File "C:\Users\Jeff\Downloads\untitled1.py", line 120, in <module>
    for count, comment in enumerate(get_comments(url)):
  File "C:\Users\Jeff\Downloads\untitled1.py", line 50, in get_comments
    data = json.loads(data_str)
  File "C:\Users\Jeff\anaconda3\lib\json\__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "C:\Users\Jeff\anaconda3\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\Jeff\anaconda3\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
JSONDecodeError: Expecting value`

@sarahelizabeth

It's not working for me either. I'm getting the same JSONDecodeError/Traceback that you are.

@x4nth055
Owner

x4nth055 commented Apr 25, 2021

I'm having the same issue too. While I try to fix the code, please refer to the YouTube API tutorial instead (that's what I'm currently using as well):

https://www.thepythoncode.com/article/using-youtube-api-in-python

Hope this helps!

@ajskateboarder

I understand this is an old issue, but I have found a fix for those who are still expecting one.

YouTube now uses var ytInitialData instead of window["ytInitialData"]. You can change line 50 (or whichever nearby line creates the data_str variable) from this:

data_str = find_value(
    res.text, 'window["ytInitialData"] = ', num_sep_chars=0, separator="\n"
).rstrip(";")

to this:

data_str = find_value(
    res.text, 'var ytInitialData = ', num_sep_chars=0, separator="\n"
).rstrip(";").split(';</script>')[0]

This script scrapes fewer comments than it should, although I don't know whether it was always like that.
Hope this helps!
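
For anyone who would rather not depend on the exact text that follows the JSON, here is a minimal sketch of the same idea that tries both markers and lets the standard library find where the object ends. The helper name extract_initial_data is hypothetical and not part of the original script; it assumes the page source still contains one of the two markers quoted above, immediately followed by a JSON object.

```python
import json

# Markers quoted in this thread: the older form first, then the newer one.
MARKERS = ('window["ytInitialData"] = ', "var ytInitialData = ")


def extract_initial_data(html):
    """Return the parsed initial-data object embedded in html.

    Hypothetical helper for illustration; not part of the original script.
    """
    decoder = json.JSONDecoder()
    for marker in MARKERS:
        start = html.find(marker)
        if start == -1:
            continue  # this marker is not in the page, try the next one
        idx = start + len(marker)
        # raw_decode does not skip leading whitespace, so do it by hand.
        while idx < len(html) and html[idx].isspace():
            idx += 1
        # raw_decode parses exactly one JSON value starting at idx and
        # ignores whatever follows it, so a trailing ";" or ";</script>"
        # does not have to be stripped manually.
        data, _end = decoder.raw_decode(html, idx)
        return data
    raise ValueError("no ytInitialData marker found in the page source")
```

Under those assumptions, `data = extract_initial_data(res.text)` would stand in for the find_value call plus the json.loads(data_str) line. It does not address the lower-than-expected comment count mentioned above.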
