xiyaowong / spiders

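Python crawlers that return information in a fixed format, download media, and expose a simple API via flask. Covers watermark-free Douyin, Pipixia, Kuaishou, NetEase Cloud Music, QQ Music, Migu Music, Lizhi FM audio, Zhihu video, Zuiyou voice and video, Weibo......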

License: MIT License

Python 99.88% Shell 0.12%
qqmusic 163music douyin kuaishou tudou lizhifm zhihu zuiyou music video spider python downloader

spiders's Issues

Fix bilibili video download

import re
import requests


def get(url: str) -> dict:
    """
    imgs、videos
    """
    data = {}
    headers = {
        "user-agent":
        "Mozilla/5.0 (iPhone; CPU iPhone OS 11_0 like Mac OS X) AppleWebKit/604.1.38 (KHTML, like Gecko) Version/11.0 Mobile/15A372 Safari/604.1",
        "Referer": "https://www.bilibili.com/",
    }

    # Despite the "av" name, this pattern matches a BV id such as "BV..." in the URL.
    av_number_pattern = r'(BV[0-9a-zA-Z]*)'
    cover_pattern = r"image: '(.*?)',"
    video_pattern = r"video_url: '(.*?)',"
    title_pattern = r'title":"(.*?)",'

    av = re.findall(av_number_pattern, url)
    if av:
        av = av[0]
    else:
        data["msg"] = "链接可能不正确,因为我无法匹配到av号"
        return data
    url = f"https://www.bilibili.com/video/{av}"

    with requests.get(url, headers=headers, timeout=10) as rep:
        if rep.status_code == 200:
            cover_url = re.findall(cover_pattern, rep.text)
            if cover_url:
                cover_url = cover_url[0]
                if '@' in cover_url:
                    cover_url = cover_url[:cover_url.index('@')]
                data["imgs"] = ['https:'+cover_url]

            video_url = re.findall(video_pattern, rep.text)
            title_text = re.findall(title_pattern, rep.text)
            if video_url:
                video_url = video_url[0]
                data["videos"] = ['https:' + video_url.replace('upos-hz-mirrorakam.akamaized.net','upos-sz-mirrorkodo.bilivideo.com')]
            if title_text:
                data["videoName"] = title_text[0]
        else:
            data["msg"] = "获取失败"
        return data


if __name__ == "__main__":
    print(get(input("url: ")))
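
The ID matching and cover-URL cleanup in the snippet above can be tried offline, without a network request. The following is only a minimal sketch: the helper names `extract_bv` and `clean_cover` are hypothetical and do not appear in the repository, and the sample inputs are made up for illustration.

```python
import re

# Same pattern as in the snippet above: "BV" followed by alphanumerics.
BV_PATTERN = r'(BV[0-9a-zA-Z]*)'


def extract_bv(url: str):
    """Return the first BV id found in url, or None if nothing matches."""
    found = re.findall(BV_PATTERN, url)
    return found[0] if found else None


def clean_cover(cover_url: str) -> str:
    """Cut everything from '@' onward and prepend the https scheme.

    Assumes cover_url is protocol-relative ("//host/path"), as the
    snippet above does when it prepends 'https:'.
    """
    if '@' in cover_url:
        cover_url = cover_url[:cover_url.index('@')]
    return 'https:' + cover_url
```

Splitting the two steps out this way makes it easier to see which part breaks when the page layout changes: a `None` from `extract_bv` points at the input link, not at the page scraping.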

Pipixia no longer works

The Pipixia crawler can no longer retrieve links. Could the author update it when there is time?

kuaishou

It seems your fix that allows downloads for kuaishou has broken the desktop links again. Same error: 'title': '快手,记录世界记录你'

Xigua Video

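Following the usual routine, three cheers first.
The author is awesome, the author is awesome, the author is awesome!
I hope Xigua Video support can be added; it seems to allow watermark-free downloads too!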

Douyin issue

douyin.py produces all the valid information, including the video and audio entries, but the video link cannot be downloaded.

Can short-video links be crawled?

I want to set up some automation. How can I automatically crawl short-video links by keyword and then download the videos?

kuaishou

Traceback (most recent call last):
  File "kuaishou.py", line 53, in <module>
    Rprint(get(input("url: ")))
NameError: name 'Rprint' is not defined

Set it back to pprint and it works perfectly.
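
For reference, a minimal sketch of the fix described above, assuming the failing line was meant to call `pprint` from the standard library. The `result` dict here is a made-up stand-in for whatever `get(...)` returns in kuaishou.py.

```python
from pprint import pformat, pprint

# Hypothetical stand-in for the dict returned by get(...).
result = {"title": "example", "videos": ["https://example.com/v.mp4"]}

# pprint must be imported before use; an undefined name such as
# Rprint raises NameError at call time, as in the traceback above.
pprint(result)

# pformat returns the same formatted text as a string instead of printing it.
text = pformat(result)
```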

Hello, your acfun interface returns m3u. Could mp4 be added?

Great work, have a thumbs-up

I have a wicked idea: port this library's Python code to Android so it can be used directly on an Android phone. Would the author support that?
