Not sure if this is the right community, but it seems close enough.

Ideally, I want a URL that I can put any paywalled news article into and get the unpaywalled version back.

E.g.: https://somedomain/https://somenewssite/somenewsarticle

I need it to work with https://pypi.org/project/newspaper4k/

Alternatively, if someone knows of another Python library that can automatically extract article text and images from just a link, that would also solve my problem.
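To make the idea concrete, here is a rough sketch of how the prefix-style URL could be fed to newspaper4k. Everything named here is an assumption for illustration: `build_proxied_url` and `extract_article` are hypothetical helpers, and `https://somedomain` is just the placeholder from above, not a real service. Whether any given prefix service actually returns the full article is something I can't promise.

```python
from typing import Optional
from urllib.parse import urlsplit


def build_proxied_url(proxy_prefix: str, article_url: str) -> str:
    """Prepend a prefix to an article URL: <proxy_prefix>/<article_url>.

    Hypothetical helper; proxy_prefix is a placeholder such as
    "https://somedomain", not a known working service.
    """
    parts = urlsplit(article_url)
    # Only accept absolute http(s) URLs so the combined URL stays well formed.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {article_url!r}")
    return proxy_prefix.rstrip("/") + "/" + article_url


def extract_article(article_url: str, proxy_prefix: Optional[str] = None) -> dict:
    """Download and parse an article with newspaper4k, optionally via a prefix."""
    # Imported lazily so build_proxied_url works without newspaper4k installed.
    from newspaper import Article

    url = build_proxied_url(proxy_prefix, article_url) if proxy_prefix else article_url
    article = Article(url)
    article.download()  # fetches the HTML over the network
    article.parse()     # extracts title, body text, and image URLs
    return {
        "title": article.title,
        "text": article.text,
        "top_image": article.top_image,
        "images": list(article.images),
    }
```

Usage would look something like `extract_article("https://somenewssite/somenewsarticle", proxy_prefix="https://somedomain")`, with both URLs swapped for real ones.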

  • _cryptagion@lemmy.dbzer0.com · 3 days ago

    12ft works, if you really need it. But in general, I just don’t read any publications that paywall their content. Mass media is all owned by one or two billionaires; if they need money, they can get it from them.

  • Byter@lemmy.one · 6 days ago

    Looks like newspaper4k uses headless Chrome. You could try loading the Bypass Paywalls Clean extension and browsing the pages directly.

    I regularly use it (in Firefox) without even thinking about it. I only notice when I send someone an article they can’t access.