Compare commits

...

57 Commits

Author SHA1 Message Date
pukkandan
c5640c4508 Release 2021.03.07 2021-03-08 00:06:26 +05:30
teesid
1f52a09e2e [vimeo] Fix videos with password
https://github.com/ytdl-org/youtube-dl/pull/27992

Fixes: https://github.com/ytdl-org/youtube-dl/issues/28354

Authored by teesid
2021-03-07 23:47:53 +05:30
pukkandan
fc21af505c Fix some videos downloading with m3u8 extension 2021-03-07 23:22:12 +05:30
pukkandan
015f3b3120 [bilibili] Change Accept header (Closes #145)
This is a temporary fix. Ideally we should find a more reasonable accept string that just "*/*"

Fixes: https://github.com/ytdl-org/youtube-dl/issues/28363 https://github.com/ytdl-org/youtube-dl/issues/28341

Thanks to animelover1984 for identifying the problem
2021-03-07 17:59:59 +05:30
Ashish
5ba4a0b69c [Documentation] Inclusion of two-line install script for Unix (#155)
Closes #83
Authored-by: Ashish <ashish@pop-os.localdomain>

ci skip all
2021-03-07 15:29:01 +05:30
nixxo
0852947fcc [rai] Check for DRM (#150)
Authored by: nixxo <nixxo@protonmail.com>
2021-03-07 13:01:59 +05:30
pukkandan
99594a11ce Remove "fixup is ignored" warning when fixup wasn't passed by user
Closes #151
2021-03-07 12:32:59 +05:30
pukkandan
2be71994c0 [youtube] Detect when Mixes end or wrap around 2021-03-07 11:04:57 +05:30
pukkandan
26fe8ffed0 [youtube] Fix community page continuation (Closes #152) 2021-03-07 11:04:55 +05:30
nixxo
feee67ae88 [gedi] Improvements from youtube-dl (#149)
Authored-by: nixxo <c.nixxo@gmail.com>
2021-03-06 23:40:32 +05:30
Ashish
1caaf92d47 [MXPlayer] Rewrite extractor with show support (#141)
Co-authored-by: Ashish <ashish@pop-os.localdomain>
Co-authored-by: pukkandan <pukkandan.ytdlp@gmail.com>
2021-03-06 01:11:02 +05:30
Matthew
d069eca7a3 [Youtube] Fix private feeds/playlists on multi-channel accounts (#143)
Authored by: colethedj
2021-03-05 19:29:14 +05:30
Matthew
f3eaa8dd1c [Youtube] Extract alerts from continuation (#144)
Related: #143

Authored by: colethedj
2021-03-05 15:37:32 +05:30
pukkandan
9e631877f8 [downloader] Fix bug for ffmpeg/httpie
Caused by: 7f7de7f94d
2021-03-05 04:22:37 +05:30
pukkandan
36147a63e3 [trovo] Pass origin header (Closes #139)
Fixes: https://github.com/ytdl-org/youtube-dl/issues/28346
2021-03-04 23:59:37 +05:30
pukkandan
57db6a87ef [lbry] Support lbry:// url
https://github.com/ytdl-org/youtube-dl/pull/28207

Fixes: https://github.com/ytdl-org/youtube-dl/issues/28084

Authored by: nixxo <nixxo@protonmail.com>
2021-03-04 23:45:28 +05:30
pukkandan
cd7c66cf01 [youtube] Fix history, trending and mix playlists (#136)
Co-authored-by: pukkandan <pukkandan.ytdlp@gmail.com>
Co-authored-by: Matthew <colethedj@protonmail.com>
2021-03-04 23:35:26 +05:30
shirt-dev
2c736b4f61 [cbs] Add support for ParamountPlus (#138)
Related: https://github.com/ytdl-org/youtube-dl/issues/28342

Authored-by: shirtjs <2660574+shirtjs@users.noreply.github.com>
2021-03-04 20:20:07 +05:30
pukkandan
c4a508ab31 [update] Fix updater removing the executable bit on some UNIX distros
Closes #133
2021-03-03 19:07:14 +05:30
pukkandan
7815e55572 [update] Fix current build hash for UNIX 2021-03-03 19:02:21 +05:30
pukkandan
162e6f0000 [version] update :ci skip all 2021-03-03 16:42:23 +05:30
pukkandan
a8278ababd Release 2021.03.03.2 2021-03-03 16:34:14 +05:30
pukkandan
bd9ed42387 [build] fix bug from da7f321e93 2021-03-03 16:31:27 +05:30
pukkandan
5f7514957f Release 2021.03.03 2021-03-03 16:27:55 +05:30
pukkandan
3721515bde Update to ytdl-2021.03.03 2021-03-03 16:04:01 +05:30
Matthew
a5c5623470 [YouTube] Use new browse API for continuation page extraction. (#131)
Known issues (these issues existed in previous API as well)
* Mix playlists only give 1 page (25 vids)
* Trending only gives 1 video
* History gives 5 pages (200 vids)

Co-authored-by: colethedj, pukkandan
2021-03-03 16:02:40 +05:30
pukkandan
c705177da2 [youtube] Throw error when --extractor-retries are exhausted (Closes #130) 2021-03-03 03:05:31 +05:30
pukkandan
d6e51845b7 Reduce default of --extractor-retries to 3
so that even those not using sleep won't get 429'd on youtube
2021-03-03 03:04:08 +05:30
hseg
da7f321e93 Fix packaging bugs (#129)
* Autogenerate `AUTHORS`
* Fix `setup.py` using wrong completion files
* Complete `ChangeLog` -> `Changelog.md` rename
* Make `make tar` respect DESTDIR
* Remove `bin/` `yt-dlp` and `docs/` from tar and sdist
* Make `pypi-files` build all files needed for `python setup.py`
* Add `completions` alias
* Add `devscripts/` and `supportedsites.md` to pip sdist
* Remove `man` target
* Remove `README.txt` from sdist
* Make `clean` more granular
* Move aliases to top

Authored by: hseg <gesh@gesh.uni.cx>
2021-03-03 02:17:44 +05:30
Ashutosh Chaudhary
097b056c5a [mxplayer] Add new extractor
https://github.com/ytdl-org/youtube-dl/pull/27325
Authored by: codeasashu
2021-03-02 17:49:48 +05:30
Han Dai
f3b737ed19 [nick] fix extraction
https://github.com/ytdl-org/youtube-dl/pull/27900
Authored by: DennyDai
2021-03-02 17:02:45 +05:30
pukkandan
ee1e05581e [mtv] Fix extractor by reverting changes made in youtube-dlc
youtube-dl has since fixed the extractor and the changes from the two sources are incompatible
2021-03-02 16:55:17 +05:30
pukkandan
ec5e77c558 Update to ytdl-2021.03.02 2021-03-02 13:56:07 +05:30
shirt-dev
b3b30a4bca Fix HLS playlist downloading (#127)
Co-authored-by: shirtjs <2660574+shirtjs@users.noreply.github.com>
2021-03-01 12:05:45 -05:00
pukkandan
5372545ddb [version] update :ci skip 2021-03-01 05:46:00 +05:30
pukkandan
5ef7d9bdd8 Release 2021.03.01 2021-03-01 05:39:50 +05:30
pukkandan
62bff2c170 Add option --extractor-retries to retry on known extractor errors
* Currently only used by youtube

Fixes https://github.com/ytdl-org/youtube-dl/issues/28194
Possibly also fixes: https://github.com/ytdl-org/youtube-dl/issues/28289 (can not confirm since the issue isn't reliably reproducible)
2021-03-01 05:18:37 +05:30
pukkandan
f0884c8b3f Cleanup some code (see desc)
* `--get-comments` doesn't imply `--write-info-json` if `-J`, `-j` or `--print-json` are used
* Don't pass `config_location` to `YoutubeDL` (it is unused)
* [bilibiliaudio] Recognize the file as audio-only
* Update gitignore
* Fix typos
2021-02-28 20:56:32 +05:30
pukkandan
277d6ff5f2 Extract comments only when needed #95 (Closes #94) 2021-02-28 20:26:08 +05:30
pukkandan
1cf376f55a Add option --sleep-requests to sleep b/w requests (Closes #106)
* Also fix documentation of `sleep_interval_subtitles`

Related issues:
https://github.com/blackjack4494/yt-dlc/issues/158
https://github.com/blackjack4494/youtube-dlc/issues/195
https://github.com/ytdl-org/youtube-dl/pull/28270
https://github.com/ytdl-org/youtube-dl/pull/28144
https://github.com/ytdl-org/youtube-dl/issues/27767
https://github.com/ytdl-org/youtube-dl/issues/23638
https://github.com/ytdl-org/youtube-dl/issues/26287
https://github.com/ytdl-org/youtube-dl/issues/26319
2021-02-27 18:14:42 +05:30
pukkandan
7f7de7f94d Allow specifying path in --external-downloader 2021-02-27 16:52:27 +05:30
pukkandan
86878b6cd9 [hrfensehen] Fix wrong import 2021-02-27 15:35:41 +05:30
pukkandan
b3d1242534 [youtube] Fix inconsistent webpage_url (closes #119) 2021-02-27 14:45:56 +05:30
pukkandan
9bd2020476 [hls] Enable --hls-use-mpegts by default when downloading live-streams
* Also added option `--no-hls-use-mpegts` to disable this

Related: #96
2021-02-26 21:52:16 +05:30
pukkandan
ed9b7e3dd3 Fix bug with m3u8 format extraction 2021-02-26 18:32:28 +05:30
shirt-dev
c552ae8838 Fix get_executable_path (#117)
Authored-by: shirtjs <2660574+shirtjs@users.noreply.github.com>
2021-02-26 04:28:02 +05:30
Robin Dunn
31a5e037a7 [viki] Fix viki play pass authentication (#111)
Authored by: RobinD42
2021-02-26 03:33:00 +05:30
pukkandan
3638226215 [ci] Disable download tests unless specifically invoked
Tests can be enabled/disabled using the following in the commit message
* Run Download: `ci-run-dl`
* Skip Core: `ci-skip`
* Skip Quick & Core: `ci-skip-all`
(replace "-" by a space " ")
2021-02-26 03:28:18 +05:30
pukkandan
14fdfea973 [youtube] Retry on incomplete ytInitialData
Related: #116
2021-02-26 03:23:08 +05:30
shirt-dev
b45d4e4a8e Fix completion paths, zsh pip completion install (#114) 2021-02-25 11:00:29 -05:00
pukkandan
3e39273418 Merge branch 'master' into fix-paths 2021-02-25 21:17:46 +05:30
shirt-dev
b965087396 Readthedocs improvements (#115)
Authored-by: shirtjs <2660574+shirtjs@users.noreply.github.com>

:ci skip dl
2021-02-25 21:16:08 +05:30
hseg
359d6d8650 Fix completion paths, zsh pip completion install
Closes: #108, #110
2021-02-25 16:49:57 +02:00
pukkandan
0e0040519b [embedthumbnail] Fix bug with deleting original thumbnail (Closes #113)
:ci skip dl
2021-02-25 18:35:04 +05:30
pukkandan
127d075955 [documentation] Fix typos (Closes #112)
:ci skip all
2021-02-25 16:08:25 +05:30
pukkandan
bce8cbb089 [tennistv] Fix format sorting 2021-02-25 16:07:38 +05:30
pukkandan
aae273ded8 [version] update :ci skip dl 2021-02-25 02:44:10 +05:30
73 changed files with 2949 additions and 2483 deletions

1
.gitattributes vendored Normal file
View File

@@ -0,0 +1 @@
Makefile* text whitespace=-tab-in-indent

View File

@@ -21,7 +21,7 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.02.19. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.03.2. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in https://github.com/yt-dlp/yt-dlp.
- Search the bugtracker for similar issues: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
@@ -29,7 +29,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running yt-dlp version **2021.02.19**
- [ ] I've verified that I'm running yt-dlp version **2021.03.03.2**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
@@ -44,7 +44,7 @@ Add the `-v` flag to your command line you run yt-dlp with (`yt-dlp -v <your com
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] yt-dlp version 2021.02.19
[debug] yt-dlp version 2021.03.03.2
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@@ -21,7 +21,7 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.02.19. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.03.2. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://github.com/yt-dlp/yt-dlp. yt-dlp does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
@@ -29,7 +29,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running yt-dlp version **2021.02.19**
- [ ] I've verified that I'm running yt-dlp version **2021.03.03.2**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones

View File

@@ -21,13 +21,13 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.02.19. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.03.2. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
- Finally, put x into all relevant boxes like this [x] (Dont forget to delete the empty space)
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running yt-dlp version **2021.02.19**
- [ ] I've verified that I'm running yt-dlp version **2021.03.03.2**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones

View File

@@ -21,7 +21,7 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.02.19. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.03.2. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in https://github.com/yt-dlp/yt-dlp.
- Search the bugtracker for similar issues: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
@@ -30,7 +30,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running yt-dlp version **2021.02.19**
- [ ] I've verified that I'm running yt-dlp version **2021.03.03.2**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
@@ -46,7 +46,7 @@ Add the `-v` flag to your command line you run yt-dlp with (`yt-dlp -v <your com
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] yt-dlp version 2021.02.19
[debug] yt-dlp version 2021.03.03.2
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@@ -21,13 +21,13 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.02.19. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.03.2. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
- Finally, put x into all relevant boxes like this [x] (Dont forget to delete the empty space)
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running yt-dlp version **2021.02.19**
- [ ] I've verified that I'm running yt-dlp version **2021.03.03.2**
- [ ] I've searched the bugtracker for similar feature requests including closed ones

View File

@@ -3,7 +3,7 @@ on: [push, pull_request]
jobs:
tests:
name: Core Tests
if: "!contains(github.event.head_commit.message, 'ci skip all')"
if: "!contains(github.event.head_commit.message, 'ci skip')"
runs-on: ${{ matrix.os }}
strategy:
fail-fast: true

View File

@@ -3,7 +3,7 @@ on: [push, pull_request]
jobs:
tests:
name: Download Tests
if: "!contains(github.event.head_commit.message, 'ci skip dl') && !contains(github.event.head_commit.message, 'ci skip all')"
if: "contains(github.event.head_commit.message, 'ci run dl')"
runs-on: ${{ matrix.os }}
strategy:
fail-fast: true

10
.gitignore vendored
View File

@@ -8,6 +8,7 @@ dist/
zip/
tmp/
venv/
completions/
# Misc
*~
@@ -24,7 +25,9 @@ updates_key.pem
*.class
# Generated
AUTHORS
README.txt
.mailmap
*.1
*.bash-completion
*.fish
@@ -34,8 +37,9 @@ README.txt
*.spec
# Binary
youtube-dl
youtube-dlc
/youtube-dl
/youtube-dlc
/yt-dlp
yt-dlp.zip
*.exe
@@ -50,12 +54,14 @@ yt-dlp.zip
*.m4v
*.mp3
*.3gp
*.webm
*.wav
*.ape
*.mkv
*.swf
*.part
*.ytdl
*.dump
*.frag
*.frag.urls
*.aria2

View File

@@ -20,4 +20,3 @@ python:
version: 3
install:
- requirements: docs/requirements.txt
- requirements: requirements.txt

View File

@@ -1,38 +0,0 @@
language: python
python:
- "2.6"
- "2.7"
- "3.2"
- "3.3"
- "3.4"
- "3.5"
- "3.6"
- "pypy"
- "pypy3"
dist: trusty
env:
- YTDL_TEST_SET=core
jobs:
include:
- python: 3.7
dist: xenial
env: YTDL_TEST_SET=core
- python: 3.8
dist: xenial
env: YTDL_TEST_SET=core
- python: 3.8-dev
dist: xenial
env: YTDL_TEST_SET=core
- env: JYTHON=true; YTDL_TEST_SET=core
- name: flake8
python: 3.8
dist: xenial
install: pip install flake8
script: flake8 .
fast_finish: true
allow_failures:
- env: YTDL_TEST_SET=download
- env: JYTHON=true; YTDL_TEST_SET=core
before_install:
- if [ "$JYTHON" == "true" ]; then ./devscripts/install_jython.sh; export PATH="$HOME/jython/bin:$PATH"; fi
script: ./devscripts/run_tests.sh

View File

@@ -21,5 +21,12 @@ nao20010128nao
kurumigi
tsukumi
bbepis
animelover1984
Pccode66
Ashish
RobinD42
hseg
colethedj
DennyDai
codeasashu
teesid

View File

@@ -17,10 +17,69 @@
-->
### 2021.03.07
* [youtube] Fix history, mixes, community pages and trending by [pukkandan](https://github.com/pukkandan) and [colethedj](https://github.com/colethedj)
* [youtube] Fix private feeds/playlists on multi-channel accounts by [colethedj](https://github.com/colethedj)
* [youtube] Extract alerts from continuation by [colethedj](https://github.com/colethedj)
* [cbs] Add support for ParamountPlus by [shirt](https://github.com/shirt-dev)
* [mxplayer] Rewrite extractor with show support by [pukkandan](https://github.com/pukkandan) and [Ashish](https://github.com/Ashish)
* [gedi] Improvements from youtube-dl by [nixxo](https://github.com/nixxo)
* [vimeo] Fix videos with password by [teesid](https://github.com/teesid)
* [lbry] Support lbry:// url by [nixxo](https://github.com/nixxo)
* [bilibili] Change `Accept` header by [pukkandan](https://github.com/pukkandan) and [animelover1984](https://github.com/animelover1984)
* [trovo] Pass origin header
* [rai] Check for DRM by [nixxo](https://github.com/nixxo)
* [downloader] Fix bug for ffmpeg/httpie
* [update] Fix updater removing the executable bit on some UNIX distros
* [update] Fix current build hash for UNIX
* [documentation] Include wget/curl/aria2c install instructions for Unix by [Ashish](https://github.com/Ashish)
* Fix some videos downloading with `m3u8` extension
* Remove "fixup is ignored" warning when fixup wasn't passed by user
### 2021.03.03.2
* [build] Fix bug
### 2021.03.03
* [youtube] Use new browse API for continuation page extraction by [colethedj](https://github.com/colethedj) and [pukkandan](https://github.com/pukkandan)
* Fix HLS playlist downloading by [shirt](https://github.com/shirt-dev)
* Merge youtube-dl: Upto [2021.03.03](https://github.com/ytdl-org/youtube-dl/releases/tag/2021.03.03)
* [mtv] Fix extractor
* [nick] Fix extractor by [DennyDai](https://github.com/DennyDai)
* [mxplayer] Add new extractor by [codeasashu](https://github.com/codeasashu)
* [youtube] Throw error when `--extractor-retries` are exhausted
* Reduce default of `--extractor-retries` to 3
* Fix packaging bugs by [hseg](https://github.com/hseg)
### 2021.03.01
* Allow specifying path in `--external-downloader`
* Add option `--sleep-requests` to sleep b/w requests
* Add option `--extractor-retries` to retry on known extractor errors
* Extract comments only when needed
* `--get-comments` doesn't imply `--write-info-json` if `-J`, `-j` or `--print-json` are used
* Fix `get_executable_path` by [shirt](https://github.com/shirt-dev)
* [youtube] Retry on more known errors than just HTTP-5xx
* [youtube] Fix inconsistent `webpage_url`
* [tennistv] Fix format sorting
* [bilibiliaudio] Recognize the file as audio-only
* [hrfensehen] Fix wrong import
* [viki] Fix viki play pass authentication by [RobinD42](https://github.com/RobinD42)
* [readthedocs] Improvements by [shirt](https://github.com/shirt-dev)
* [hls] Fix bug with m3u8 format extraction
* [hls] Enable `--hls-use-mpegts` by default when downloading live-streams
* [embedthumbnail] Fix bug with deleting original thumbnail
* [build] Fix completion paths, zsh pip completion install by [hseg](https://github.com/hseg)
* [ci] Disable download tests unless specifically invoked
* Cleanup some code and fix typos
### 2021.02.24
* Moved project to an organization [yt-dlp](https://github.com/yt-dlp)
* **Completely changed project name to yt-dlp** by [Pccode66](https://github.com/Pccode66) and [pukkandan](https://github.com/pukkandan)
* **Merge youtube-dl:** Upto [commit/4460329](https://github.com/ytdl-org/youtube-dl/commit/44603290e5002153f3ebad6230cc73aef42cc2cd) (except tmz, gedi)
* Also, `youtube-dlc` config files are no longer loaded
* Merge youtube-dl: Upto [commit/4460329](https://github.com/ytdl-org/youtube-dl/commit/44603290e5002153f3ebad6230cc73aef42cc2cd) (except tmz, gedi)
* [Readthedocs](https://yt-dlp.readthedocs.io) support by [shirt](https://github.com/shirt-dev)
* [youtube] Show if video was a live stream in info (`was_live`)
* [Zee5] Add new extractor by [Ashish](https://github.com/Ashish) and [pukkandan](https://github.com/pukkandan)
@@ -28,17 +87,17 @@
* [tennistv] Fix extractor
* [hls] Support media initialization by [shirt](https://github.com/shirt-dev)
* [hls] Added options `--hls-split-discontinuity` to better support media discontinuity by [shirt](https://github.com/shirt-dev)
* [ffmpeg] Allow passing custom arguments before -i using `--ppa "ffmpeg_i1:ARGS"` synatax
* [ffmpeg] Allow passing custom arguments before -i using `--ppa "ffmpeg_i1:ARGS"` syntax
* Fix `--windows-filenames` removing `/` from UNIX paths
* [hls] Show warning if pycryptodome is not found
* [documentation] Improvements
* Fix documentation of `Extractor Options`
* Document `all` in format selection (Closes #101)
* Document `all` in format selection
* Document `playable_in_embed` in output templates
### 2021.02.19
* **Merge youtube-dl:** Upto [commit/cf2dbec](https://github.com/ytdl-org/youtube-dl/commit/cf2dbec6301177a1fddf72862de05fa912d9869d) (except kakao)
* Merge youtube-dl: Upto [commit/cf2dbec](https://github.com/ytdl-org/youtube-dl/commit/cf2dbec6301177a1fddf72862de05fa912d9869d) (except kakao)
* [viki] Fix extractor
* [niconico] Extract `channel` and `channel_id` by [kurumigi](https://github.com/kurumigi)
* [youtube] Multiple page support for hashtag URLs
@@ -63,7 +122,7 @@
### 2021.02.15
* **Merge youtube-dl:** Upto [2021.02.10](https://github.com/ytdl-org/youtube-dl/releases/tag/2021.02.10) (except archive.org)
* Merge youtube-dl: Upto [2021.02.10](https://github.com/ytdl-org/youtube-dl/releases/tag/2021.02.10) (except archive.org)
* [niconico] Improved extraction and support encrypted/SMILE movies by [kurumigi](https://github.com/kurumigi), [tsukumi](https://github.com/tsukumi), [bbepis](https://github.com/bbepis), [pukkandan](https://github.com/pukkandan)
* Fix HLS AES-128 with multiple keys in external downloaders by [shirt](https://github.com/shirt-dev)
* [youtube_live_chat] Fix by using POST API by [siikamiika](https://github.com/siikamiika)
@@ -106,7 +165,7 @@
### 2021.02.04
* **Merge youtube-dl:** Upto [2021.02.04.1](https://github.com/ytdl-org/youtube-dl/releases/tag/2021.02.04.1)
* Merge youtube-dl: Upto [2021.02.04.1](https://github.com/ytdl-org/youtube-dl/releases/tag/2021.02.04.1)
* **Date/time formatting in output template:**
* You can use [`strftime`](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) to format date/time fields. Example: `%(upload_date>%Y-%m-%d)s`
* **Multiple output templates:**
@@ -160,7 +219,7 @@
### 2021.01.24
* **Merge youtube-dl:** Upto [2021.01.24](https://github.com/ytdl-org/youtube-dl/releases/tag/2021.01.16)
* Merge youtube-dl: Upto [2021.01.24](https://github.com/ytdl-org/youtube-dl/releases/tag/2021.01.16)
* Plugin support ([documentation](https://github.com/yt-dlp/yt-dlp#plugins))
* **Multiple paths**: New option `-P`/`--paths` to give different paths for different types of files
* The syntax is `-P "type:path" -P "type:path"` ([documentation](https://github.com/yt-dlp/yt-dlp#:~:text=-P,%20--paths%20TYPE:PATH))
@@ -189,7 +248,7 @@
### 2021.01.16
* **Merge youtube-dl:** Upto [2021.01.16](https://github.com/ytdl-org/youtube-dl/releases/tag/2021.01.16)
* Merge youtube-dl: Upto [2021.01.16](https://github.com/ytdl-org/youtube-dl/releases/tag/2021.01.16)
* **Configuration files:**
* Portable configuration file: `./yt-dlp.conf`
* Allow the configuration files to be named `yt-dlp` instead of `youtube-dlc`. See [this](https://github.com/yt-dlp/yt-dlp#configuration) for details
@@ -235,8 +294,7 @@
### 2021.01.08
* **Merge youtube-dl:** Upto [2021.01.08](https://github.com/ytdl-org/youtube-dl/releases/tag/2021.01.08)
* Extractor stitcher ([1](https://github.com/ytdl-org/youtube-dl/commit/bb38a1215718cdf36d73ff0a7830a64cd9fa37cc), [2](https://github.com/ytdl-org/youtube-dl/commit/a563c97c5cddf55f8989ed7ea8314ef78e30107f)) have not been merged
* Merge youtube-dl: Upto [2021.01.08](https://github.com/ytdl-org/youtube-dl/releases/tag/2021.01.08) except stitcher ([1](https://github.com/ytdl-org/youtube-dl/commit/bb38a1215718cdf36d73ff0a7830a64cd9fa37cc), [2](https://github.com/ytdl-org/youtube-dl/commit/a563c97c5cddf55f8989ed7ea8314ef78e30107f))
* Moved changelog to seperate file
@@ -275,8 +333,8 @@
* Changed video format sorting to show video only files and video+audio files together.
* Added `--video-multistreams`, `--no-video-multistreams`, `--audio-multistreams`, `--no-audio-multistreams`
* Added `b`,`w`,`v`,`a` as alias for `best`, `worst`, `video` and `audio` respectively
* **Shortcut Options:** Added `--write-link`, `--write-url-link`, `--write-webloc-link`, `--write-desktop-link` by [h-h-h-h](https://github.com/h-h-h-h) - See [Internet Shortcut Options](README.md#internet-shortcut-options) for details
* **Sponskrub integration:** Added `--sponskrub`, `--sponskrub-cut`, `--sponskrub-force`, `--sponskrub-location`, `--sponskrub-args` - See [SponSkrub Options](README.md#sponskrub-options-sponsorblock) for details
* Shortcut Options: Added `--write-link`, `--write-url-link`, `--write-webloc-link`, `--write-desktop-link` by [h-h-h-h](https://github.com/h-h-h-h) - See [Internet Shortcut Options](README.md#internet-shortcut-options) for details
* **Sponskrub integration:** Added `--sponskrub`, `--sponskrub-cut`, `--sponskrub-force`, `--sponskrub-location`, `--sponskrub-args` - See [SponSkrub Options](README.md#sponskrub-sponsorblock-options) for details
* Added `--force-download-archive` (`--force-write-archive`) by [h-h-h-h](https://github.com/h-h-h-h)
* Added `--list-formats-as-table`, `--list-formats-old`
* **Negative Options:** Makes it possible to negate most boolean options by adding a `no-` to the switch. Usefull when you want to reverse an option that is defined in a config file
@@ -285,7 +343,7 @@
* Relaxed validation for format filters so that any arbitrary field can be used
* Fix for embedding thumbnail in mp3 by [pauldubois98](https://github.com/pauldubois98) ([ytdl-org/youtube-dl#21569](https://github.com/ytdl-org/youtube-dl/pull/21569))
* Make Twitch Video ID output from Playlist and VOD extractor same. This is only a temporary fix
* **Merge youtube-dl:** Upto [2021.01.03](https://github.com/ytdl-org/youtube-dl/commit/8e953dcbb10a1a42f4e12e4e132657cb0100a1f8) - See [blackjack4494/yt-dlc#280](https://github.com/blackjack4494/yt-dlc/pull/280) for details
* Merge youtube-dl: Upto [2021.01.03](https://github.com/ytdl-org/youtube-dl/commit/8e953dcbb10a1a42f4e12e4e132657cb0100a1f8) - See [blackjack4494/yt-dlc#280](https://github.com/blackjack4494/yt-dlc/pull/280) for details
* Extractors [tiktok](https://github.com/ytdl-org/youtube-dl/commit/fb626c05867deab04425bad0c0b16b55473841a2) and [hotstar](https://github.com/ytdl-org/youtube-dl/commit/bb38a1215718cdf36d73ff0a7830a64cd9fa37cc) have not been merged
* Cleaned up the fork for public use

View File

@@ -1,9 +1,9 @@
include README.md
include LICENSE
include AUTHORS
include ChangeLog
include yt-dlp.bash-completion
include yt-dlp.fish
include Changelog.md
include LICENSE
include README.md
include completions/*/*
include supportedsites.md
include yt-dlp.1
recursive-include docs Makefile conf.py *.rst
recursive-include devscripts *
recursive-include test *

View File

@@ -1,12 +1,28 @@
all: yt-dlp doc man
all: yt-dlp doc pypi-files
clean: clean-test clean-dist clean-cache
completions: completion-bash completion-fish completion-zsh
doc: README.md CONTRIBUTING.md issuetemplates supportedsites
man: README.txt yt-dlp.1 yt-dlp.bash-completion yt-dlp.zsh yt-dlp.fish
ot: offlinetest
tar: yt-dlp.tar.gz
# Keep this list in sync with MANIFEST.in
# intended use: when building a source distribution,
# make pypi-files && python setup.py sdist
pypi-files: AUTHORS Changelog.md LICENSE README.md README.txt supportedsites completions yt-dlp.1 devscripts/* test/*
clean:
rm -rf yt-dlp.1.temp.md yt-dlp.1 yt-dlp.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ yt-dlp.tar.gz yt-dlp.zsh yt-dlp.fish yt_dlp/extractor/lazy_extractors.py *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png *.spec *.frag *.frag.urls *.frag.aria2 CONTRIBUTING.md.tmp yt-dlp yt-dlp.exe
find . -name "*.pyc" -delete
find . -name "*.class" -delete
.PHONY: all clean install test tar pypi-files completions ot offlinetest codetest supportedsites
clean-test:
rm -rf *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png *.frag *.frag.urls *.frag.aria2
clean-dist:
rm -rf yt-dlp.1.temp.md yt-dlp.1 README.txt MANIFEST build/ dist/ .coverage cover/ yt-dlp.tar.gz completions/ yt_dlp/extractor/lazy_extractors.py *.spec CONTRIBUTING.md.tmp yt-dlp yt-dlp.exe yt_dlp.egg-info/ AUTHORS .mailmap
clean-cache:
find . -name "*.pyc" -o -name "*.class" -delete
completion-bash: completions/bash/yt-dlp
completion-fish: completions/fish/yt-dlp.fish
completion-zsh: completions/zsh/_yt-dlp
lazy-extractors: yt_dlp/extractor/lazy_extractors.py
PREFIX ?= /usr/local
BINDIR ?= $(PREFIX)/bin
@@ -21,17 +37,12 @@ SYSCONFDIR = $(shell if [ $(PREFIX) = /usr -o $(PREFIX) = /usr/local ]; then ech
# set markdown input format to "markdown-smart" for pandoc version 2 and to "markdown" for pandoc prior to version 2
MARKDOWN = $(shell if [ `pandoc -v | head -n1 | cut -d" " -f2 | head -c1` = "2" ]; then echo markdown-smart; else echo markdown; fi)
install: yt-dlp yt-dlp.1 yt-dlp.bash-completion yt-dlp.zsh yt-dlp.fish
install -d $(DESTDIR)$(BINDIR)
install -m 755 yt-dlp $(DESTDIR)$(BINDIR)
install -d $(DESTDIR)$(MANDIR)/man1
install -m 644 yt-dlp.1 $(DESTDIR)$(MANDIR)/man1
install -d $(DESTDIR)$(SYSCONFDIR)/bash_completion.d
install -m 644 yt-dlp.bash-completion $(DESTDIR)$(SYSCONFDIR)/bash_completion.d/yt-dlp
install -d $(DESTDIR)$(SHAREDIR)/zsh/site-functions
install -m 644 yt-dlp.zsh $(DESTDIR)$(SHAREDIR)/zsh/site-functions/_yt-dlp
install -d $(DESTDIR)$(SYSCONFDIR)/fish/completions
install -m 644 yt-dlp.fish $(DESTDIR)$(SYSCONFDIR)/fish/completions/yt-dlp.fish
install: yt-dlp yt-dlp.1 completions
install -Dm755 yt-dlp $(DESTDIR)$(BINDIR)
install -Dm644 yt-dlp.1 $(DESTDIR)$(MANDIR)/man1
install -Dm644 completions/bash/yt-dlp $(DESTDIR)$(SHAREDIR)/bash-completion/completions/yt-dlp
install -Dm644 completions/zsh/_yt-dlp $(DESTDIR)$(SHAREDIR)/zsh/site-functions/_yt-dlp
install -Dm644 completions/fish/yt-dlp.fish $(DESTDIR)$(SHAREDIR)/fish/vendor_completions.d/yt-dlp.fish
codetest:
flake8 .
@@ -41,8 +52,6 @@ test:
nosetests --verbose test
$(MAKE) codetest
ot: offlinetest
# Keep this list in sync with devscripts/run_tests.sh
offlinetest: codetest
$(PYTHON) -m nose --verbose test \
@@ -57,12 +66,6 @@ offlinetest: codetest
--exclude test_youtube_signature.py \
--exclude test_post_hooks.py
tar: yt-dlp.tar.gz
.PHONY: all clean install test tar bash-completion pypi-files zsh-completion fish-completion ot offlinetest codetest supportedsites
pypi-files: yt-dlp.bash-completion README.txt yt-dlp.1 yt-dlp.fish
yt-dlp: yt_dlp/*.py yt_dlp/*/*.py
mkdir -p zip
for d in yt_dlp yt_dlp/downloader yt_dlp/extractor yt_dlp/postprocessor ; do \
@@ -92,7 +95,7 @@ issuetemplates: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/5_feature_request.md .github/ISSUE_TEMPLATE/5_feature_request.md
supportedsites:
$(PYTHON) devscripts/make_supportedsites.py docs/supportedsites.md
$(PYTHON) devscripts/make_supportedsites.py supportedsites.md
README.txt: README.md
pandoc -f $(MARKDOWN) -t plain README.md -o README.txt
@@ -102,29 +105,24 @@ yt-dlp.1: README.md
pandoc -s -f $(MARKDOWN) -t man yt-dlp.1.temp.md -o yt-dlp.1
rm -f yt-dlp.1.temp.md
yt-dlp.bash-completion: yt_dlp/*.py yt_dlp/*/*.py devscripts/bash-completion.in
completions/bash/yt-dlp: yt_dlp/*.py yt_dlp/*/*.py devscripts/bash-completion.in
mkdir -p completions/bash
$(PYTHON) devscripts/bash-completion.py
bash-completion: yt-dlp.bash-completion
yt-dlp.zsh: yt_dlp/*.py yt_dlp/*/*.py devscripts/zsh-completion.in
completions/zsh/_yt-dlp: yt_dlp/*.py yt_dlp/*/*.py devscripts/zsh-completion.in
mkdir -p completions/zsh
$(PYTHON) devscripts/zsh-completion.py
zsh-completion: yt-dlp.zsh
yt-dlp.fish: yt_dlp/*.py yt_dlp/*/*.py devscripts/fish-completion.in
completions/fish/yt-dlp.fish: yt_dlp/*.py yt_dlp/*/*.py devscripts/fish-completion.in
mkdir -p completions/fish
$(PYTHON) devscripts/fish-completion.py
fish-completion: yt-dlp.fish
lazy-extractors: yt_dlp/extractor/lazy_extractors.py
_EXTRACTOR_FILES = $(shell find yt_dlp/extractor -iname '*.py' -and -not -iname 'lazy_extractors.py')
yt_dlp/extractor/lazy_extractors.py: devscripts/make_lazy_extractors.py devscripts/lazy_load_template.py $(_EXTRACTOR_FILES)
$(PYTHON) devscripts/make_lazy_extractors.py $@
yt-dlp.tar.gz: yt-dlp README.md README.txt yt-dlp.1 yt-dlp.bash-completion yt-dlp.zsh yt-dlp.fish ChangeLog AUTHORS
@tar -czf yt-dlp.tar.gz --transform "s|^|yt-dlp/|" --owner 0 --group 0 \
yt-dlp.tar.gz: README.md yt-dlp.1 completions Changelog.md AUTHORS
@tar -czf $(DESTDIR)/yt-dlp.tar.gz --transform "s|^|yt-dlp/|" --owner 0 --group 0 \
--exclude '*.DS_Store' \
--exclude '*.kate-swp' \
--exclude '*.pyc' \
@@ -134,8 +132,13 @@ yt-dlp.tar.gz: yt-dlp README.md README.txt yt-dlp.1 yt-dlp.bash-completion yt-dl
--exclude '.git' \
--exclude 'docs/_build' \
-- \
bin devscripts test yt_dlp docs \
ChangeLog AUTHORS LICENSE README.md README.txt \
Makefile MANIFEST.in yt-dlp.1 yt-dlp.bash-completion \
yt-dlp.zsh yt-dlp.fish setup.py setup.cfg \
yt-dlp
devscripts test \
Changelog.md AUTHORS LICENSE README.md supportedsites.md \
Makefile MANIFEST.in yt-dlp.1 completions \
setup.py setup.cfg yt-dlp
AUTHORS: .mailmap
git shortlog -s -n | cut -f2 | sort > AUTHORS
.mailmap:
git shortlog -s -e -n | awk '!(out[$$NF]++) { $$1="";sub(/^[ \t]+/,""); print}' > .mailmap

View File

@@ -11,7 +11,7 @@
[![PyPi Downloads](https://img.shields.io/pypi/dm/yt-dlp?label=PyPi)](https://pypi.org/project/yt-dlp)
[![Doc Status](https://readthedocs.org/projects/yt-dlp/badge/?version=latest)](https://yt-dlp.readthedocs.io)
A command-line program to download videos from youtube.com and many other [video platforms](docs/supportedsites.md)
A command-line program to download videos from youtube.com and many other [video platforms](supportedsites.md)
This is a fork of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) which is inturn a fork of [youtube-dl](https://github.com/ytdl-org/youtube-dl)
@@ -57,7 +57,7 @@ The major new features from the latest release of [blackjack4494/yt-dlc](https:/
* **[Format Sorting](#sorting-formats)**: The default format sorting options have been changed so that higher resolution and better codecs will be now preferred instead of simply using larger bitrate. Furthermore, you can now specify the sort order using `-S`. This allows for much easier format selection that what is possible by simply using `--format` ([examples](#format-selection-examples))
* **Merged with youtube-dl v2021.02.10**: You get all the latest features and patches of [youtube-dl](https://github.com/ytdl-org/youtube-dl) in addition to all the features of [youtube-dlc](https://github.com/blackjack4494/yt-dlc)
* **Merged with youtube-dl v2021.03.03**: You get all the latest features and patches of [youtube-dl](https://github.com/ytdl-org/youtube-dl) in addition to all the features of [youtube-dlc](https://github.com/blackjack4494/yt-dlc)
* **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--get-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, Playlist infojson etc. Note that the NicoNico improvements are not available. See [#31](https://github.com/yt-dlp/yt-dlp/pull/31) for details.
@@ -92,7 +92,7 @@ See [changelog](Changelog.md) or [commits](https://github.com/yt-dlp/yt-dlp/comm
**PS**: Some of these changes are already in youtube-dlc, but are still unreleased. See [this](Changelog.md#unreleased-changes-in-blackjack4494yt-dlc) for details
If you are coming from [youtube-dl](https://github.com/ytdl-org/youtube-dl), the amount of changes are very large. Compare [options](#options) and [supported sites](docs/supportedsites.md) with youtube-dl's to get an idea of the massive number of features/patches [youtube-dlc](https://github.com/blackjack4494/yt-dlc) has accumulated.
If you are coming from [youtube-dl](https://github.com/ytdl-org/youtube-dl), the amount of changes are very large. Compare [options](#options) and [supported sites](supportedsites.md) with youtube-dl's to get an idea of the massive number of features/patches [youtube-dlc](https://github.com/blackjack4494/yt-dlc) has accumulated.
# INSTALLATION
@@ -103,6 +103,23 @@ You can install yt-dlp using one of the following methods:
* Use pip+git: `python -m pip install --upgrade git+https://github.com/yt-dlp/yt-dlp.git@release`
* Install master branch: `python -m pip install --upgrade git+https://github.com/yt-dlp/yt-dlp`
UNIX users (Linux, macOS, BSD) can also install the [latest release](https://github.com/yt-dlp/yt-dlp/releases/latest) one of the following ways:
```
sudo curl -L https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp -o /usr/local/bin/yt-dlp
sudo chmod a+rx /usr/local/bin/yt-dlp
```
```
sudo wget https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp -O /usr/local/bin/yt-dlp
sudo chmod a+rx /usr/local/bin/yt-dlp
```
```
sudo aria2c https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp -o /usr/local/bin/yt-dlp
sudo chmod a+rx /usr/local/bin/yt-dlp
```
### UPDATE
Starting from version `2021.02.09`, you can use `yt-dlp -U` to update if you are using the provided release.
If you are using `pip`, simply re-run the same command that was used to install the program.
@@ -122,12 +139,12 @@ You can also build the executable without any version info or metadata by using:
**For Unix**:
You will need the required build tools: `python`, `make` (GNU), `pandoc`, `zip`, `nosetests`
Then simply run `make`. You can also run `make youtube_dlc` instead to compile only the binary without updating any of the additional files
Then simply run `make`. You can also run `make yt-dlp` instead to compile only the binary without updating any of the additional files
**Note**: In either platform, `devscripts\update-version.py` can be used to automatically update the version number
# DESCRIPTION
**yt-dlp** is a command-line program to download videos from youtube.com many other [video platforms](docs/supportedsites.md). It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on macOS. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
**yt-dlp** is a command-line program to download videos from youtube.com many other [video platforms](supportedsites.md). It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on macOS. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
yt-dlp [OPTIONS] [--] URL [URL...]
@@ -245,7 +262,7 @@ Then simply run `make`. You can also run `make youtube_dlc` instead to compile o
"OUTPUT TEMPLATE" for a list of available
keys) to match if the key is present, !key
to check if the key is not present,
key>NUMBER (like "comment_count > 12", also
key>NUMBER (like "view_count > 12", also
works with >=, <, <=, !=, =) to compare
against a number, key = 'LITERAL' (like
"uploader = 'Mike Smith'", also works with
@@ -317,13 +334,19 @@ Then simply run `make`. You can also run `make youtube_dlc` instead to compile o
ffmpeg
--hls-prefer-ffmpeg Use ffmpeg instead of the native HLS
downloader
--hls-use-mpegts Use the mpegts container for HLS videos,
allowing to play the video while
downloading (some players may not be able
to play it)
--external-downloader NAME Use the specified external downloader.
Currently supports aria2c, avconv, axel,
curl, ffmpeg, httpie, wget
--hls-use-mpegts Use the mpegts container for HLS videos;
allowing some players to play the video
while downloading, and reducing the chance
of file corruption if download is
interrupted. This is enabled by default for
live streams
--no-hls-use-mpegts Do not use the mpegts container for HLS
videos. This is default when not
downloading live streams
--external-downloader NAME Name or path of the external downloader to
use. Currently supports aria2c, avconv,
axel, curl, ffmpeg, httpie, wget
(Recommended: aria2c)
--downloader-args NAME:ARGS Give these arguments to the external
downloader. Specify the downloader name and
the arguments separated by a colon ":". You
@@ -397,7 +420,9 @@ Then simply run `make`. You can also run `make youtube_dlc` instead to compile o
--no-write-playlist-metafiles Do not write playlist metadata when using
--write-info-json, --write-description etc.
--get-comments Retrieve video comments to be placed in the
.info.json file
.info.json file. The comments are fetched
even without this option if the extraction
is known to be quick
--load-info-json FILE JSON file containing the video information
(created with the "--write-info-json"
option)
@@ -485,6 +510,8 @@ Then simply run `make`. You can also run `make youtube_dlc` instead to compile o
--bidi-workaround Work around terminals that lack
bidirectional text support. Requires bidiv
or fribidi executable in PATH
--sleep-requests SECONDS Number of seconds to sleep between requests
during data extraction
--sleep-interval SECONDS Number of seconds to sleep before each
download when used alone or a lower bound
of a range for randomized sleep before each
@@ -495,7 +522,8 @@ Then simply run `make`. You can also run `make youtube_dlc` instead to compile o
before each download (maximum possible
number of seconds to sleep). Must only be
used along with --min-sleep-interval
--sleep-subtitles SECONDS Enforce sleep interval on subtitles as well
--sleep-subtitles SECONDS Number of seconds to sleep before each
subtitle download
## Video Format Options:
-f, --format FORMAT Video format code, see "FORMAT SELECTION"
@@ -641,7 +669,7 @@ Then simply run `make`. You can also run `make youtube_dlc` instead to compile o
similar syntax to the output template can
also be used. The parsed parameters replace
any existing values and can be use in
output templateThis option can be used
output template. This option can be used
multiple times. Example: --parse-metadata
"title:%(artist)s - %(title)s" matches a
title like "Coldplay - Paradise". Example
@@ -686,6 +714,8 @@ Then simply run `make`. You can also run `make youtube_dlc` instead to compile o
directory
## Extractor Options:
--extractor-retries RETRIES Number of retries for known extractor
errors (default is 3), or "infinite"
--allow-dynamic-mpd Process dynamic DASH manifests (default)
(Alias: --no-ignore-dynamic-mpd)
--ignore-dynamic-mpd Do not process dynamic DASH manifests
@@ -805,7 +835,7 @@ The available fields are:
- `dislike_count` (numeric): Number of negative ratings of the video
- `repost_count` (numeric): Number of reposts of the video
- `average_rating` (numeric): Average rating give by users, the scale used depends on the webpage
- `comment_count` (numeric): Number of comments on the video
- `comment_count` (numeric): Number of comments on the video (For some extractors, comments are only downloaded at the end, and so this field cannot be used)
- `age_limit` (numeric): Age restriction for the video (years)
- `is_live` (boolean): Whether this video is a live stream or a fixed-length video
- `was_live` (boolean): Whether this video was originally a live stream

View File

@@ -8,7 +8,7 @@ import sys
sys.path.insert(0, dirn(dirn((os.path.abspath(__file__)))))
import yt_dlp
BASH_COMPLETION_FILE = "yt-dlp.bash-completion"
BASH_COMPLETION_FILE = "completions/bash/yt-dlp"
BASH_COMPLETION_TEMPLATE = "devscripts/bash-completion.in"

View File

@@ -10,7 +10,7 @@ sys.path.insert(0, dirn(dirn((os.path.abspath(__file__)))))
import yt_dlp
from yt_dlp.utils import shell_quote
FISH_COMPLETION_FILE = 'yt-dlp.fish'
FISH_COMPLETION_FILE = 'completions/fish/yt-dlp.fish'
FISH_COMPLETION_TEMPLATE = 'devscripts/fish-completion.in'
EXTRA_ARGS = {

View File

@@ -61,7 +61,7 @@ if ! type pandoc >/dev/null 2>/dev/null; then echo 'ERROR: pandoc is missing'; e
if ! python3 -c 'import rsa' 2>/dev/null; then echo 'ERROR: python3-rsa is missing'; exit 1; fi
if ! python3 -c 'import wheel' 2>/dev/null; then echo 'ERROR: wheel is missing'; exit 1; fi
read -p "Is ChangeLog up to date? (y/n) " -n 1
read -p "Is Changelog up to date? (y/n) " -n 1
if [[ ! $REPLY =~ ^[Yy]$ ]]; then exit 1; fi
/bin/echo -e "\n### First of all, testing..."
@@ -75,12 +75,12 @@ fi
/bin/echo -e "\n### Changing version in version.py..."
sed -i "s/__version__ = '.*'/__version__ = '$version'/" yt_dlp/version.py
/bin/echo -e "\n### Changing version in ChangeLog..."
sed -i "s/<unreleased>/$version/" ChangeLog
/bin/echo -e "\n### Changing version in Changelog..."
sed -i "s/<unreleased>/$version/" Changelog.md
/bin/echo -e "\n### Committing documentation, templates and yt_dlp/version.py..."
make README.md CONTRIBUTING.md issuetemplates supportedsites
git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE/1_broken_site.md .github/ISSUE_TEMPLATE/2_site_support_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md .github/ISSUE_TEMPLATE/4_bug_report.md .github/ISSUE_TEMPLATE/5_feature_request.md .github/ISSUE_TEMPLATE/6_question.md docs/supportedsites.md yt_dlp/version.py ChangeLog
git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE/1_broken_site.md .github/ISSUE_TEMPLATE/2_site_support_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md .github/ISSUE_TEMPLATE/4_bug_report.md .github/ISSUE_TEMPLATE/5_feature_request.md .github/ISSUE_TEMPLATE/6_question.md docs/supportedsites.md yt_dlp/version.py Changelog.md
git commit $gpg_sign_commits -m "release $version"
/bin/echo -e "\n### Now tagging, signing and pushing..."
@@ -111,7 +111,7 @@ RELEASE_FILES="yt-dlp yt-dlp.exe yt-dlp-$version.tar.gz"
for f in $RELEASE_FILES; do gpg --passphrase-repeat 5 --detach-sig "build/$version/$f"; done
ROOT=$(pwd)
python devscripts/create-github-release.py ChangeLog $version "$ROOT/build/$version"
python devscripts/create-github-release.py Changelog.md $version "$ROOT/build/$version"
ssh ytdl@yt-dl.org "sh html/update_latest.sh $version"

View File

@@ -8,7 +8,7 @@ import sys
sys.path.insert(0, dirn(dirn((os.path.abspath(__file__)))))
import yt_dlp
ZSH_COMPLETION_FILE = "yt-dlp.zsh"
ZSH_COMPLETION_FILE = "completions/zsh/_yt-dlp"
ZSH_COMPLETION_TEMPLATE = "devscripts/zsh-completion.in"

5
docs/Changelog.md Normal file
View File

@@ -0,0 +1,5 @@
---
orphan: true
---
```{include} ../Changelog.md
```

6
docs/LICENSE.md Normal file
View File

@@ -0,0 +1,6 @@
---
orphan: true
---
# LICENSE
```{include} ../LICENSE
```

2
docs/README.md Normal file
View File

@@ -0,0 +1,2 @@
```{include} ../README.md
```

View File

@@ -7,26 +7,21 @@ import os
# Allows to import yt-dlp
sys.path.insert(0, os.path.abspath('..'))
from recommonmark.transform import AutoStructify
# -- General configuration ------------------------------------------------
# The suffix of source filenames.
source_suffix = ['.rst', '.md']
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
'sphinx.ext.autodoc',
'recommonmark',
'myst_parser',
]
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The master toctree document.
master_doc = 'index'
master_doc = 'README'
# General information about the project.
project = u'yt-dlp'
@@ -64,12 +59,10 @@ highlight_language = 'none'
# so a file named "default.css" will overwrite the builtin "default.css".
# html_static_path = ['_static']
# Enable heading anchors
myst_heading_anchors = 4
def setup(app):
app.add_config_value('recommonmark_config', {
'enable_math': False,
'enable_inline_math': False,
'enable_eval_rst': True,
'enable_auto_toc_tree': True,
}, True)
app.add_transform(AutoStructify)
# Suppress heading warnings
suppress_warnings = [
'myst.header',
]

View File

@@ -1 +0,0 @@
../README.md

View File

@@ -1,2 +1 @@
recommonmark>=0.6.0
m2r2
myst-parser

File diff suppressed because it is too large Load Diff

6
docs/ytdlp_plugins.md Normal file
View File

@@ -0,0 +1,6 @@
---
orphan: true
---
# ytdlp_plugins
See [https://github.com/yt-dlp/yt-dlp/tree/master/ytdlp_plugins](https://github.com/yt-dlp/yt-dlp/tree/master/ytdlp_plugins).

View File

@@ -27,8 +27,9 @@ if len(sys.argv) >= 2 and sys.argv[1] == 'py2exe':
print("inv")
else:
files_spec = [
('etc/bash_completion.d', ['yt-dlp.bash-completion']),
('etc/fish/completions', ['yt-dlp.fish']),
('share/bash-completion/completions', ['completions/bash/yt-dlp']),
('share/zsh/site-functions', ['completions/zsh/_yt-dlp']),
('share/fish/vendor_completions.d', ['completions/fish/yt-dlp.fish']),
('share/doc/yt_dlp', ['README.txt']),
('share/man/man1', ['yt-dlp.1'])
]
@@ -38,7 +39,7 @@ else:
resfiles = []
for fn in files:
if not os.path.exists(fn):
warnings.warn('Skipping file %s since it is not present. Type make to build all automatically generated files.' % fn)
warnings.warn('Skipping file %s since it is not present. Try running `make pypi-files` first.' % fn)
else:
resfiles.append(fn)
data_files.append((dirname, resfiles))

1247
supportedsites.md Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -37,7 +37,6 @@ class TestAllURLsMatching(unittest.TestCase):
assertPlaylist('PL63F0C78739B09958')
assertTab('https://www.youtube.com/AsapSCIENCE')
assertTab('https://www.youtube.com/embedded')
assertTab('https://www.youtube.com/feed') # Own channel's home page
assertTab('https://www.youtube.com/playlist?list=UUBABnxM4Ar9ten8Mdjj1j0Q')
assertTab('https://www.youtube.com/course?list=ECUl4u3cNGP61MdtwGTqZA0MreSaDybji8')
assertTab('https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC')

View File

@@ -324,6 +324,8 @@ class YoutubeDL(object):
source_address: Client-side IP address to bind to.
call_home: Boolean, true iff we are allowed to contact the
yt-dlp servers for debugging. (BROKEN)
sleep_interval_requests: Number of seconds to sleep between requests
during extraction
sleep_interval: Number of seconds to sleep before each download when
used alone or a lower bound of a range for randomized
sleep before each download (minimum possible number
@@ -334,6 +336,7 @@ class YoutubeDL(object):
Must only be used along with sleep_interval.
Actual sleep time will be a random float from range
[sleep_interval; max_sleep_interval].
sleep_interval_subtitles: Number of seconds to sleep before each subtitle download
listformats: Print an overview of available video formats and exit.
list_thumbnails: Print a table of all thumbnails and exit.
match_filter: A function that gets called with the info_dict of
@@ -378,6 +381,7 @@ class YoutubeDL(object):
Use 'default' as the name for arguments to passed to all PP
The following options are used by the extractors:
extractor_retries: Number of times to retry for known errors
dynamic_mpd: Whether to process dynamic DASH manifests (default: True)
hls_split_discontinuity: Split HLS playlists to different formats at
discontinuities such as ad breaks (default: False)
@@ -406,6 +410,7 @@ class YoutubeDL(object):
_ies = []
_pps = {'beforedl': [], 'aftermove': [], 'normal': []}
__prepare_filename_warned = False
_first_webpage_request = True
_download_retcode = None
_num_downloads = None
_playlist_level = 0
@@ -420,6 +425,7 @@ class YoutubeDL(object):
self._ies_instances = {}
self._pps = {'beforedl': [], 'aftermove': [], 'normal': []}
self.__prepare_filename_warned = False
self._first_webpage_request = True
self._post_hooks = []
self._progress_hooks = []
self._download_retcode = 0
@@ -2036,6 +2042,7 @@ class YoutubeDL(object):
self.to_stdout(formatSeconds(info_dict['duration']))
print_mandatory('format')
if self.params.get('forcejson', False):
self.post_extract(info_dict)
self.to_stdout(json.dumps(info_dict))
def process_info(self, info_dict):
@@ -2059,6 +2066,7 @@ class YoutubeDL(object):
if self._match_entry(info_dict, incomplete=False) is not None:
return
self.post_extract(info_dict)
self._num_downloads += 1
info_dict = self.pre_process(info_dict)
@@ -2166,15 +2174,6 @@ class YoutubeDL(object):
else:
try:
dl(sub_filename, sub_info, subtitle=True)
'''
if self.params.get('sleep_interval_subtitles', False):
dl(sub_filename, sub_info)
else:
sub_data = ie._request_webpage(
sub_info['url'], info_dict['id'], note=False).read()
with io.open(encodeFilename(sub_filename), 'wb') as subfile:
subfile.write(sub_data)
'''
files_to_move[sub_filename] = sub_filename_final
except (ExtractorError, IOError, OSError, ValueError, compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
self.report_warning('Unable to download subtitle for "%s": %s' %
@@ -2501,6 +2500,7 @@ class YoutubeDL(object):
raise
else:
if self.params.get('dump_single_json', False):
self.post_extract(res)
self.to_stdout(json.dumps(res))
return self._download_retcode
@@ -2549,6 +2549,24 @@ class YoutubeDL(object):
del files_to_move[old_filename]
return files_to_move, infodict
@staticmethod
def post_extract(info_dict):
def actual_post_extract(info_dict):
if info_dict.get('_type') in ('playlist', 'multi_video'):
for video_dict in info_dict.get('entries', {}):
actual_post_extract(video_dict)
return
if '__post_extractor' not in info_dict:
return
post_extractor = info_dict['__post_extractor']
if post_extractor:
info_dict.update(post_extractor().items())
del info_dict['__post_extractor']
return
actual_post_extract(info_dict)
def pre_process(self, ie_info):
info = dict(ie_info)
for pp in self._pps['beforedl']:

View File

@@ -169,25 +169,33 @@ def _real_main(argv=None):
parser.error('max sleep interval must be greater than or equal to min sleep interval')
else:
opts.max_sleep_interval = opts.sleep_interval
if opts.sleep_interval_subtitles is not None:
if opts.sleep_interval_subtitles < 0:
parser.error('subtitles sleep interval must be positive or 0')
if opts.sleep_interval_requests is not None:
if opts.sleep_interval_requests < 0:
parser.error('requests sleep interval must be positive or 0')
if opts.ap_mso and opts.ap_mso not in MSO_INFO:
parser.error('Unsupported TV Provider, use --ap-list-mso to get a list of supported TV Providers')
if opts.overwrites:
# --yes-overwrites implies --no-continue
opts.continue_dl = False
def parse_retries(retries):
def parse_retries(retries, name=''):
if retries in ('inf', 'infinite'):
parsed_retries = float('inf')
else:
try:
parsed_retries = int(retries)
except (TypeError, ValueError):
parser.error('invalid retry count specified')
parser.error('invalid %sretry count specified' % name)
return parsed_retries
if opts.retries is not None:
opts.retries = parse_retries(opts.retries)
if opts.fragment_retries is not None:
opts.fragment_retries = parse_retries(opts.fragment_retries)
opts.fragment_retries = parse_retries(opts.fragment_retries, 'fragment ')
if opts.extractor_retries is not None:
opts.extractor_retries = parse_retries(opts.extractor_retries, 'extractor ')
if opts.buffersize is not None:
numeric_buffersize = FileDownloader.parse_bytes(opts.buffersize)
if numeric_buffersize is None:
@@ -262,6 +270,11 @@ def _real_main(argv=None):
any_printing = opts.print_json
download_archive_fn = expand_path(opts.download_archive) if opts.download_archive is not None else opts.download_archive
# If JSON is not printed anywhere, but comments are requested, save it to file
printing_json = opts.dumpjson or opts.print_json or opts.dump_single_json
if opts.getcomments and not printing_json:
opts.writeinfojson = True
def report_conflict(arg1, arg2):
write_string('WARNING: %s is ignored since %s was given\n' % (arg2, arg1), out=sys.stderr)
if opts.remuxvideo and opts.recodevideo:
@@ -447,6 +460,7 @@ def _real_main(argv=None):
'overwrites': opts.overwrites,
'retries': opts.retries,
'fragment_retries': opts.fragment_retries,
'extractor_retries': opts.extractor_retries,
'skip_unavailable_fragments': opts.skip_unavailable_fragments,
'keep_fragments': opts.keep_fragments,
'buffersize': opts.buffersize,
@@ -466,7 +480,7 @@ def _real_main(argv=None):
'updatetime': opts.updatetime,
'writedescription': opts.writedescription,
'writeannotations': opts.writeannotations,
'writeinfojson': opts.writeinfojson or opts.getcomments,
'writeinfojson': opts.writeinfojson,
'allow_playlist_files': opts.allow_playlist_files,
'getcomments': opts.getcomments,
'writethumbnail': opts.writethumbnail,
@@ -524,6 +538,7 @@ def _real_main(argv=None):
'fixup': opts.fixup,
'source_address': opts.source_address,
'call_home': opts.call_home,
'sleep_interval_requests': opts.sleep_interval_requests,
'sleep_interval': opts.sleep_interval,
'max_sleep_interval': opts.max_sleep_interval,
'sleep_interval_subtitles': opts.sleep_interval_subtitles,
@@ -541,7 +556,6 @@ def _real_main(argv=None):
'postprocessor_args': opts.postprocessor_args,
'cn_verification_proxy': opts.cn_verification_proxy,
'geo_verification_proxy': opts.geo_verification_proxy,
'config_location': opts.config_location,
'geo_bypass': opts.geo_bypass,
'geo_bypass_country': opts.geo_bypass_country,
'geo_bypass_ip_block': opts.geo_bypass_ip_block,

View File

@@ -53,7 +53,7 @@ def get_suitable_downloader(info_dict, params={}, default=HttpFD):
external_downloader = params.get('external_downloader')
if external_downloader is not None:
ed = get_external_downloader(external_downloader)
if ed.can_download(info_dict):
if ed.can_download(info_dict, external_downloader):
return ed
if protocol.startswith('m3u8'):

View File

@@ -85,16 +85,16 @@ class ExternalFD(FileDownloader):
return self.params.get('external_downloader')
@classmethod
def available(cls):
return check_executable(cls.get_basename(), [cls.AVAILABLE_OPT])
def available(cls, path=None):
return check_executable(path or cls.get_basename(), [cls.AVAILABLE_OPT])
@classmethod
def supports(cls, info_dict):
return info_dict['protocol'] in cls.SUPPORTED_PROTOCOLS
@classmethod
def can_download(cls, info_dict):
return cls.available() and cls.supports(info_dict)
def can_download(cls, info_dict, path=None):
return cls.available(path) and cls.supports(info_dict)
def _option(self, command_option, param):
return cli_option(self.params, command_option, param)
@@ -282,8 +282,8 @@ class Aria2cFD(ExternalFD):
class HttpieFD(ExternalFD):
@classmethod
def available(cls):
return check_executable('http', ['--version'])
def available(cls, path=None):
return check_executable(path or 'http', ['--version'])
def _make_cmd(self, tmpfilename, info_dict):
cmd = ['http', '--download', '--output', tmpfilename, info_dict['url']]
@@ -298,7 +298,7 @@ class FFmpegFD(ExternalFD):
SUPPORTED_PROTOCOLS = ('http', 'https', 'ftp', 'ftps', 'm3u8', 'rtsp', 'rtmp', 'mms')
@classmethod
def available(cls):
def available(cls, path=None): # path is ignored for ffmpeg
return FFmpegPostProcessor().available
def _call_downloader(self, tmpfilename, info_dict):
@@ -398,7 +398,10 @@ class FFmpegFD(ExternalFD):
args += ['-fs', compat_str(self._TEST_FILE_SIZE)]
if protocol in ('m3u8', 'm3u8_native'):
if self.params.get('hls_use_mpegts', False) or tmpfilename == '-':
use_mpegts = (tmpfilename == '-') or self.params.get('hls_use_mpegts')
if use_mpegts is None:
use_mpegts = info_dict.get('is_live')
if use_mpegts:
args += ['-f', 'mpegts']
else:
args += ['-f', 'mp4']

View File

@@ -0,0 +1,37 @@
# coding: utf-8
from __future__ import unicode_literals
from .brightcove import BrightcoveNewIE
from ..utils import extract_attributes
class BandaiChannelIE(BrightcoveNewIE):
IE_NAME = 'bandaichannel'
_VALID_URL = r'https?://(?:www\.)?b-ch\.com/titles/(?P<id>\d+/\d+)'
_TESTS = [{
'url': 'https://www.b-ch.com/titles/514/001',
'md5': 'a0f2d787baa5729bed71108257f613a4',
'info_dict': {
'id': '6128044564001',
'ext': 'mp4',
'title': 'メタルファイターMIKU 第1話',
'timestamp': 1580354056,
'uploader_id': '5797077852001',
'upload_date': '20200130',
'duration': 1387.733,
},
'params': {
'format': 'bestvideo',
'skip_download': True,
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
attrs = extract_attributes(self._search_regex(
r'(<video-js[^>]+\bid="bcplayer"[^>]*>)', webpage, 'player'))
bc = self._download_json(
'https://pbifcd.b-ch.com/v1/playbackinfo/ST/70/' + attrs['data-info'],
video_id, headers={'X-API-KEY': attrs['data-auth'].strip()})['bc']
return self._parse_brightcove_metadata(bc, bc['id'])

View File

@@ -5,10 +5,15 @@ import itertools
import re
from .common import InfoExtractor
from ..compat import (
compat_etree_Element,
compat_HTTPError,
compat_urlparse,
)
from ..utils import (
ExtractorError,
clean_html,
dict_get,
ExtractorError,
float_or_none,
get_element_by_class,
int_or_none,
@@ -21,11 +26,6 @@ from ..utils import (
urlencode_postdata,
urljoin,
)
from ..compat import (
compat_etree_Element,
compat_HTTPError,
compat_urlparse,
)
class BBCCoUkIE(InfoExtractor):
@@ -793,6 +793,20 @@ class BBCIE(BBCCoUkIE):
'description': 'Learn English words and phrases from this story',
},
'add_ie': [BBCCoUkIE.ie_key()],
}, {
# BBC Reel
'url': 'https://www.bbc.com/reel/video/p07c6sb6/how-positive-thinking-is-harming-your-happiness',
'info_dict': {
'id': 'p07c6sb9',
'ext': 'mp4',
'title': 'How positive thinking is harming your happiness',
'alt_title': 'The downsides of positive thinking',
'description': 'md5:fad74b31da60d83b8265954ee42d85b4',
'duration': 235,
'thumbnail': r're:https?://.+/p07c9dsr.jpg',
'upload_date': '20190604',
'categories': ['Psychology'],
},
}]
@classmethod
@@ -929,7 +943,7 @@ class BBCIE(BBCCoUkIE):
else:
entry['title'] = info['title']
entry['formats'].extend(info['formats'])
except Exception as e:
except ExtractorError as e:
# Some playlist URL may fail with 500, at the same time
# the other one may work fine (e.g.
# http://www.bbc.com/turkce/haberler/2015/06/150615_telabyad_kentin_cogu)
@@ -980,6 +994,37 @@ class BBCIE(BBCCoUkIE):
'subtitles': subtitles,
}
# bbc reel (e.g. https://www.bbc.com/reel/video/p07c6sb6/how-positive-thinking-is-harming-your-happiness)
initial_data = self._parse_json(self._html_search_regex(
r'<script[^>]+id=(["\'])initial-data\1[^>]+data-json=(["\'])(?P<json>(?:(?!\2).)+)',
webpage, 'initial data', default='{}', group='json'), playlist_id, fatal=False)
if initial_data:
init_data = try_get(
initial_data, lambda x: x['initData']['items'][0], dict) or {}
smp_data = init_data.get('smpData') or {}
clip_data = try_get(smp_data, lambda x: x['items'][0], dict) or {}
version_id = clip_data.get('versionID')
if version_id:
title = smp_data['title']
formats, subtitles = self._download_media_selector(version_id)
self._sort_formats(formats)
image_url = smp_data.get('holdingImageURL')
display_date = init_data.get('displayDate')
topic_title = init_data.get('topicTitle')
return {
'id': version_id,
'title': title,
'formats': formats,
'alt_title': init_data.get('shortTitle'),
'thumbnail': image_url.replace('$recipe', 'raw') if image_url else None,
'description': smp_data.get('summary') or init_data.get('shortSummary'),
'upload_date': display_date.replace('-', '') if display_date else None,
'subtitles': subtitles,
'duration': int_or_none(clip_data.get('duration')),
'categories': [topic_title] if topic_title else None,
}
# Morph based embed (e.g. http://www.bbc.co.uk/sport/live/olympics/36895975)
# There are several setPayload calls may be present but the video
# seems to be always related to the first one
@@ -1041,7 +1086,7 @@ class BBCIE(BBCCoUkIE):
thumbnail = None
image_url = current_programme.get('image_url')
if image_url:
thumbnail = image_url.replace('{recipe}', '1920x1920')
thumbnail = image_url.replace('{recipe}', 'raw')
return {
'id': programme_id,
'title': title,

View File

@@ -138,6 +138,11 @@ class BiliBiliIE(InfoExtractor):
anime_id = mobj.group('anime_id')
page_id = mobj.group('page')
webpage = self._download_webpage(url, video_id)
headers = {
'Referer': url,
'Accept': '*/*'
}
headers.update(self.geo_verification_headers())
if 'anime/' not in url:
cid = self._search_regex(
@@ -155,12 +160,8 @@ class BiliBiliIE(InfoExtractor):
if 'no_bangumi_tip' not in smuggled_data:
self.to_screen('Downloading episode %s. To download all videos in anime %s, re-run yt-dlp with %s' % (
video_id, anime_id, compat_urlparse.urljoin(url, '//bangumi.bilibili.com/anime/%s' % anime_id)))
headers = {
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
'Referer': url
}
headers.update(self.geo_verification_headers())
headers['Content-Type'] = 'application/x-www-form-urlencoded; charset=UTF-8'
js = self._download_json(
'http://bangumi.bilibili.com/web_api/get_source', video_id,
data=urlencode_postdata({'episode_id': video_id}),
@@ -169,11 +170,6 @@ class BiliBiliIE(InfoExtractor):
self._report_error(js)
cid = js['result']['cid']
headers = {
'Referer': url
}
headers.update(self.geo_verification_headers())
entries = []
RENDITIONS = ('qn=80&quality=80&type=', 'quality=2&type=mp4')
@@ -255,10 +251,6 @@ class BiliBiliIE(InfoExtractor):
info['uploader'] = self._html_search_meta(
'author', webpage, 'uploader', default=None)
comments = None
if self._downloader.params.get('getcomments', False):
comments = self._get_all_comment_pages(video_id)
raw_danmaku = self._get_raw_danmaku(video_id, cid)
raw_tags = self._get_tags(video_id)
@@ -266,11 +258,18 @@ class BiliBiliIE(InfoExtractor):
top_level_info = {
'raw_danmaku': raw_danmaku,
'comments': comments,
'comment_count': len(comments) if comments is not None else None,
'tags': tags,
'raw_tags': raw_tags,
}
if self._downloader.params.get('getcomments', False):
def get_comments():
comments = self._get_all_comment_pages(video_id)
return {
'comments': comments,
'comment_count': len(comments)
}
top_level_info['__post_extractor'] = get_comments
'''
# Requires https://github.com/m13253/danmaku2ass which is licenced under GPL3
@@ -555,6 +554,7 @@ class BilibiliAudioIE(BilibiliAudioBaseIE):
formats = [{
'url': play_data['cdns'][0],
'filesize': int_or_none(play_data.get('size')),
'vcodec': 'none'
}]
song = self._call_api('song/info', au_id)

View File

@@ -27,10 +27,10 @@ class CBSBaseIE(ThePlatformFeedIE):
class CBSIE(CBSBaseIE):
_VALID_URL = r'(?:cbs:|https?://(?:www\.)?(?:cbs\.com/shows/[^/]+/video|colbertlateshow\.com/(?:video|podcasts))/)(?P<id>[\w-]+)'
_VALID_URL = r'(?:cbs:|https?://(?:www\.)?(?:(?:cbs\.com|paramountplus\.com)/shows/[^/]+/video|colbertlateshow\.com/(?:video|podcasts))/)(?P<id>[\w-]+)'
_TESTS = [{
'url': 'http://www.cbs.com/shows/garth-brooks/video/_u7W953k6la293J7EPTd9oHkSPs6Xn6_/connect-chat-feat-garth-brooks/',
'url': 'https://www.cbs.com/shows/garth-brooks/video/_u7W953k6la293J7EPTd9oHkSPs6Xn6_/connect-chat-feat-garth-brooks/',
'info_dict': {
'id': '_u7W953k6la293J7EPTd9oHkSPs6Xn6_',
'ext': 'mp4',
@@ -52,16 +52,19 @@ class CBSIE(CBSBaseIE):
}, {
'url': 'http://www.colbertlateshow.com/podcasts/dYSwjqPs_X1tvbV_P2FcPWRa_qT6akTC/in-the-bad-room-with-stephen/',
'only_matching': True,
}, {
'url': 'https://www.paramountplus.com/shows/star-trek-discovery/video/l5ANMH9wM7kxwV1qr4u1xn88XOhYMlZX/star-trek-discovery-the-vulcan-hello/',
'only_matching': True,
}]
def _extract_video_info(self, content_id, site='cbs', mpx_acc=2198311517):
items_data = self._download_xml(
'http://can.cbs.com/thunder/player/videoPlayerService.php',
'https://can.cbs.com/thunder/player/videoPlayerService.php',
content_id, query={'partner': site, 'contentId': content_id})
video_data = xpath_element(items_data, './/item')
title = xpath_text(video_data, 'videoTitle', 'title') or xpath_text(video_data, 'videotitle', 'title')
tp_path = 'dJ5BDC/media/guid/%d/%s' % (mpx_acc, content_id)
tp_release_url = 'http://link.theplatform.com/s/' + tp_path
tp_release_url = 'https://link.theplatform.com/s/' + tp_path
asset_types = []
subtitles = {}

View File

@@ -294,6 +294,14 @@ class InfoExtractor(object):
players on other sites. Can be True (=always allowed),
False (=never allowed), None (=unknown), or a string
specifying the criteria for embedability (Eg: 'whitelist').
__post_extractor: A function to be called just before the metadata is
written to either disk, logger or console. The function
must return a dict which will be added to the info_dict.
This is usefull for additional information that is
time-consuming to extract. Note that the fields thus
extracted will not be available to output template and
match_filter. So, only "comments" and "comment_count" are
currently allowed to be extracted via this method.
The following fields should only be used when the video belongs to some logical
chapter or section:
@@ -606,6 +614,14 @@ class InfoExtractor(object):
See _download_webpage docstring for arguments specification.
"""
if not self._downloader._first_webpage_request:
sleep_interval = float_or_none(self._downloader.params.get('sleep_interval_requests')) or 0
if sleep_interval > 0:
self.to_screen('Sleeping %s seconds ...' % sleep_interval)
time.sleep(sleep_interval)
else:
self._downloader._first_webpage_request = False
if note is None:
self.report_download_webpage(video_id)
elif note is not False:
@@ -1833,8 +1849,9 @@ class InfoExtractor(object):
def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
entry_protocol='m3u8', preference=None, quality=None,
m3u8_id=None, live=False, note=None, errnote=None,
fatal=True, data=None, headers={}, query={}):
m3u8_id=None, note=None, errnote=None,
fatal=True, live=False, data=None, headers={},
query={}):
res = self._download_webpage_handle(
m3u8_url, video_id,
note=note or 'Downloading m3u8 information',
@@ -1888,13 +1905,16 @@ class InfoExtractor(object):
# media playlist and MUST NOT appear in master playlist thus we can
# clearly detect media playlist with this criterion.
def _extract_m3u8_playlist_formats(format_url, m3u8_doc=None):
def _extract_m3u8_playlist_formats(format_url=None, m3u8_doc=None, video_id=None,
fatal=True, data=None, headers={}):
if not m3u8_doc:
if not format_url:
return []
res = self._download_webpage_handle(
format_url, video_id,
note=False,
errnote=errnote or 'Failed to download m3u8 playlist information',
fatal=fatal, data=data, headers=headers, query=query)
errnote='Failed to download m3u8 playlist information',
fatal=fatal, data=data, headers=headers)
if res is False:
return []
@@ -1928,7 +1948,7 @@ class InfoExtractor(object):
if '#EXT-X-TARGETDURATION' in m3u8_doc: # media playlist, return as is
playlist_formats = _extract_m3u8_playlist_formats(m3u8_doc, True)
playlist_formats = _extract_m3u8_playlist_formats(m3u8_doc=m3u8_doc)
for format in playlist_formats:
format_id = []
@@ -1966,7 +1986,8 @@ class InfoExtractor(object):
if media_url:
manifest_url = format_url(media_url)
format_id = []
playlist_formats = _extract_m3u8_playlist_formats(manifest_url)
playlist_formats = _extract_m3u8_playlist_formats(manifest_url, video_id=video_id,
fatal=fatal, data=data, headers=headers)
for format in playlist_formats:
format_index = format.get('index')
@@ -2027,13 +2048,14 @@ class InfoExtractor(object):
or last_stream_inf.get('BANDWIDTH'), scale=1000)
manifest_url = format_url(line.strip())
playlist_formats = _extract_m3u8_playlist_formats(manifest_url)
playlist_formats = _extract_m3u8_playlist_formats(manifest_url, video_id=video_id,
fatal=fatal, data=data, headers=headers)
for format in playlist_formats:
for frmt in playlist_formats:
format_id = []
if m3u8_id:
format_id.append(m3u8_id)
format_index = format.get('index')
format_index = frmt.get('index')
stream_name = build_stream_name()
# Bandwidth of live streams may differ over time thus making
# format_id unpredictable. So it's better to keep provided
@@ -2088,6 +2110,8 @@ class InfoExtractor(object):
# TODO: update acodec for audio only formats with
# the same GROUP-ID
f['acodec'] = 'none'
if not f.get('ext'):
f['ext'] = 'm4a' if f.get('vcodec') == 'none' else 'mp4'
formats.append(f)
# for DailyMotion

View File

@@ -1,193 +1,43 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
unified_strdate,
xpath_text,
determine_ext,
float_or_none,
ExtractorError,
)
from .zdf import ZDFIE
class DreiSatIE(InfoExtractor):
class DreiSatIE(ZDFIE):
IE_NAME = '3sat'
_GEO_COUNTRIES = ['DE']
_VALID_URL = r'https?://(?:www\.)?3sat\.de/mediathek/(?:(?:index|mediathek)\.php)?\?(?:(?:mode|display)=[^&]+&)*obj=(?P<id>[0-9]+)'
_TESTS = [
{
'url': 'http://www.3sat.de/mediathek/index.php?mode=play&obj=45918',
'md5': 'be37228896d30a88f315b638900a026e',
_VALID_URL = r'https?://(?:www\.)?3sat\.de/(?:[^/]+/)*(?P<id>[^/?#&]+)\.html'
_TESTS = [{
# Same as https://www.zdf.de/dokumentation/ab-18/10-wochen-sommer-102.html
'url': 'https://www.3sat.de/film/ab-18/10-wochen-sommer-108.html',
'md5': '0aff3e7bc72c8813f5e0fae333316a1d',
'info_dict': {
'id': '45918',
'id': '141007_ab18_10wochensommer_film',
'ext': 'mp4',
'title': 'Ab 18! - 10 Wochen Sommer',
'description': 'md5:8253f41dc99ce2c3ff892dac2d65fe26',
'duration': 2660,
'timestamp': 1608604200,
'upload_date': '20201222',
},
}, {
'url': 'https://www.3sat.de/gesellschaft/schweizweit/waidmannsheil-100.html',
'info_dict': {
'id': '140913_sendung_schweizweit',
'ext': 'mp4',
'title': 'Waidmannsheil',
'description': 'md5:cce00ca1d70e21425e72c86a98a56817',
'uploader': 'SCHWEIZWEIT',
'uploader_id': '100000210',
'timestamp': 1410623100,
'upload_date': '20140913'
},
'params': {
'skip_download': True, # m3u8 downloads
'skip_download': True,
}
},
{
'url': 'http://www.3sat.de/mediathek/mediathek.php?mode=play&obj=51066',
}, {
# Same as https://www.zdf.de/filme/filme-sonstige/der-hauptmann-112.html
'url': 'https://www.3sat.de/film/spielfilm/der-hauptmann-100.html',
'only_matching': True,
},
]
def _parse_smil_formats(self, smil, smil_url, video_id, namespace=None, f4m_params=None, transform_rtmp_url=None):
param_groups = {}
for param_group in smil.findall(self._xpath_ns('./head/paramGroup', namespace)):
group_id = param_group.get(self._xpath_ns(
'id', 'http://www.w3.org/XML/1998/namespace'))
params = {}
for param in param_group:
params[param.get('name')] = param.get('value')
param_groups[group_id] = params
formats = []
for video in smil.findall(self._xpath_ns('.//video', namespace)):
src = video.get('src')
if not src:
continue
bitrate = int_or_none(self._search_regex(r'_(\d+)k', src, 'bitrate', None)) or float_or_none(video.get('system-bitrate') or video.get('systemBitrate'), 1000)
group_id = video.get('paramGroup')
param_group = param_groups[group_id]
for proto in param_group['protocols'].split(','):
formats.append({
'url': '%s://%s' % (proto, param_group['host']),
'app': param_group['app'],
'play_path': src,
'ext': 'flv',
'format_id': '%s-%d' % (proto, bitrate),
'tbr': bitrate,
})
self._sort_formats(formats)
return formats
def extract_from_xml_url(self, video_id, xml_url):
doc = self._download_xml(
xml_url, video_id,
note='Downloading video info',
errnote='Failed to download video info')
status_code = xpath_text(doc, './status/statuscode')
if status_code and status_code != 'ok':
if status_code == 'notVisibleAnymore':
message = 'Video %s is not available' % video_id
else:
message = '%s returned error: %s' % (self.IE_NAME, status_code)
raise ExtractorError(message, expected=True)
title = xpath_text(doc, './/information/title', 'title', True)
urls = []
formats = []
for fnode in doc.findall('.//formitaeten/formitaet'):
video_url = xpath_text(fnode, 'url')
if not video_url or video_url in urls:
continue
urls.append(video_url)
is_available = 'http://www.metafilegenerator' not in video_url
geoloced = 'static_geoloced_online' in video_url
if not is_available or geoloced:
continue
format_id = fnode.attrib['basetype']
format_m = re.match(r'''(?x)
(?P<vcodec>[^_]+)_(?P<acodec>[^_]+)_(?P<container>[^_]+)_
(?P<proto>[^_]+)_(?P<index>[^_]+)_(?P<indexproto>[^_]+)
''', format_id)
ext = determine_ext(video_url, None) or format_m.group('container')
if ext == 'meta':
continue
elif ext == 'smil':
formats.extend(self._extract_smil_formats(
video_url, video_id, fatal=False))
elif ext == 'm3u8':
# the certificates are misconfigured (see
# https://github.com/ytdl-org/youtube-dl/issues/8665)
if video_url.startswith('https://'):
continue
formats.extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', 'm3u8_native',
m3u8_id=format_id, fatal=False))
elif ext == 'f4m':
formats.extend(self._extract_f4m_formats(
video_url, video_id, f4m_id=format_id, fatal=False))
else:
quality = xpath_text(fnode, './quality')
if quality:
format_id += '-' + quality
abr = int_or_none(xpath_text(fnode, './audioBitrate'), 1000)
vbr = int_or_none(xpath_text(fnode, './videoBitrate'), 1000)
tbr = int_or_none(self._search_regex(
r'_(\d+)k', video_url, 'bitrate', None))
if tbr and vbr and not abr:
abr = tbr - vbr
formats.append({
'format_id': format_id,
'url': video_url,
'ext': ext,
'acodec': format_m.group('acodec'),
'vcodec': format_m.group('vcodec'),
'abr': abr,
'vbr': vbr,
'tbr': tbr,
'width': int_or_none(xpath_text(fnode, './width')),
'height': int_or_none(xpath_text(fnode, './height')),
'filesize': int_or_none(xpath_text(fnode, './filesize')),
'protocol': format_m.group('proto').lower(),
})
geolocation = xpath_text(doc, './/details/geolocation')
if not formats and geolocation and geolocation != 'none':
self.raise_geo_restricted(countries=self._GEO_COUNTRIES)
self._sort_formats(formats)
thumbnails = []
for node in doc.findall('.//teaserimages/teaserimage'):
thumbnail_url = node.text
if not thumbnail_url:
continue
thumbnail = {
'url': thumbnail_url,
}
thumbnail_key = node.get('key')
if thumbnail_key:
m = re.match('^([0-9]+)x([0-9]+)$', thumbnail_key)
if m:
thumbnail['width'] = int(m.group(1))
thumbnail['height'] = int(m.group(2))
thumbnails.append(thumbnail)
upload_date = unified_strdate(xpath_text(doc, './/details/airtime'))
return {
'id': video_id,
'title': title,
'description': xpath_text(doc, './/information/detail'),
'duration': int_or_none(xpath_text(doc, './/details/lengthSec')),
'thumbnails': thumbnails,
'uploader': xpath_text(doc, './/details/originChannelTitle'),
'uploader_id': xpath_text(doc, './/details/originChannelId'),
'upload_date': upload_date,
'formats': formats,
}
def _real_extract(self, url):
video_id = self._match_id(url)
details_url = 'http://www.3sat.de/mediathek/xmlservice/web/beitragsDetails?id=%s' % video_id
return self.extract_from_xml_url(video_id, details_url)
}, {
# Same as https://www.zdf.de/wissen/nano/nano-21-mai-2019-102.html, equal media ids
'url': 'https://www.3sat.de/wissen/nano/nano-21-mai-2019-102.html',
'only_matching': True,
}]

View File

@@ -103,6 +103,7 @@ from .awaan import (
)
from .azmedien import AZMedienIE
from .baidu import BaiduVideoIE
from .bandaichannel import BandaiChannelIE
from .bandcamp import BandcampIE, BandcampAlbumIE, BandcampWeeklyIE
from .bbc import (
BBCCoUkIE,
@@ -449,10 +450,7 @@ from .gamestar import GameStarIE
from .gaskrank import GaskrankIE
from .gazeta import GazetaIE
from .gdcvault import GDCVaultIE
from .gedi import (
GediIE,
GediEmbedsIE,
)
from .gedidigital import GediDigitalIE
from .generic import GenericIE
from .gfycat import GfycatIE
from .giantbomb import GiantBombIE
@@ -737,6 +735,7 @@ from .mtv import (
)
from .muenchentv import MuenchenTVIE
from .mwave import MwaveIE, MwaveMeetGreetIE
from .mxplayer import MxplayerIE
from .mychannels import MyChannelsIE
from .myspace import MySpaceIE, MySpaceAlbumIE
from .myspass import MySpassIE

View File

@@ -1,266 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
base_url,
url_basename,
urljoin,
)
class GediBaseIE(InfoExtractor):
@staticmethod
def _clean_audio_fmts(formats):
unique_formats = []
for f in formats:
if 'acodec' in f:
unique_formats.append(f)
formats[:] = unique_formats
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
player_data = re.findall(
r'PlayerFactory\.setParam\(\'(?P<type>.+?)\',\s*\'(?P<name>.+?)\',\s*\'(?P<val>.+?)\'\);',
webpage)
formats = []
audio_fmts = []
hls_fmts = []
http_fmts = []
title = ''
thumb = ''
fmt_reg = r'(?P<t>video|audio)-(?P<p>rrtv|hls)-(?P<h>[\w\d]+)(?:-(?P<br>[\w\d]+))?$'
br_reg = r'video-rrtv-(?P<br>\d+)-'
for t, n, v in player_data:
if t == 'format':
m = re.match(fmt_reg, n)
if m:
# audio formats
if m.group('t') == 'audio':
if m.group('p') == 'hls':
audio_fmts.extend(self._extract_m3u8_formats(
v, video_id, 'm4a', m3u8_id='hls', fatal=False))
elif m.group('p') == 'rrtv':
audio_fmts.append({
'format_id': 'mp3',
'url': v,
'tbr': 128,
'ext': 'mp3',
'vcodec': 'none',
'acodec': 'mp3',
})
# video formats
elif m.group('t') == 'video':
# hls manifest video
if m.group('p') == 'hls':
hls_fmts.extend(self._extract_m3u8_formats(
v, video_id, 'mp4', m3u8_id='hls', fatal=False))
# direct mp4 video
elif m.group('p') == 'rrtv':
if not m.group('br'):
mm = re.search(br_reg, v)
http_fmts.append({
'format_id': 'https-' + m.group('h'),
'protocol': 'https',
'url': v,
'tbr': int(m.group('br')) if m.group('br') else
(int(mm.group('br')) if mm.group('br') else 0),
'height': int(m.group('h'))
})
elif t == 'param':
if n == 'videotitle':
title = v
if n == 'image_full_play':
thumb = v
title = self._og_search_title(webpage) if title == '' else title
# clean weird char
title = compat_str(title).encode('utf8', 'replace').replace(b'\xc3\x82', b'').decode('utf8', 'replace')
if audio_fmts:
self._clean_audio_fmts(audio_fmts)
self._sort_formats(audio_fmts)
if hls_fmts:
self._sort_formats(hls_fmts)
if http_fmts:
self._sort_formats(http_fmts)
formats.extend(audio_fmts)
formats.extend(hls_fmts)
formats.extend(http_fmts)
return {
'id': video_id,
'title': title,
'description': self._html_search_meta('twitter:description', webpage),
'thumbnail': thumb,
'formats': formats,
}
class GediIE(GediBaseIE):
_VALID_URL = r'''(?x)https?://video\.
(?:
(?:espresso\.)?repubblica
|lastampa
|huffingtonpost
|ilsecoloxix
|iltirreno
|messaggeroveneto
|ilpiccolo
|gazzettadimantova
|mattinopadova
|laprovinciapavese
|tribunatreviso
|nuovavenezia
|gazzettadimodena
|lanuovaferrara
|corrierealpi
|lasentinella
)
(?:\.gelocal)?\.it/(?!embed/).+?/(?P<id>[\d/]+)(?:\?|\&|$)'''
_TESTS = [{
'url': 'https://video.lastampa.it/politica/il-paradosso-delle-regionali-la-lega-vince-ma-sembra-aver-perso/121559/121683',
'md5': '84658d7fb9e55a6e57ecc77b73137494',
'info_dict': {
'id': '121559/121683',
'ext': 'mp4',
'title': 'Il paradosso delle Regionali: ecco perché la Lega vince ma sembra aver perso',
'description': 'md5:de7f4d6eaaaf36c153b599b10f8ce7ca',
'thumbnail': r're:^https://www\.repstatic\.it/video/photo/.+?-thumb-social-play\.jpg$',
},
}, {
'url': 'https://video.repubblica.it/motori/record-della-pista-a-spa-francorchamps-la-pagani-huayra-roadster-bc-stupisce/367415/367963',
'md5': 'e763b94b7920799a0e0e23ffefa2d157',
'info_dict': {
'id': '367415/367963',
'ext': 'mp4',
'title': 'Record della pista a Spa Francorchamps, la Pagani Huayra Roadster BC stupisce',
'description': 'md5:5deb503cefe734a3eb3f07ed74303920',
'thumbnail': r're:^https://www\.repstatic\.it/video/photo/.+?-thumb-social-play\.jpg$',
},
}, {
'url': 'https://video.ilsecoloxix.it/sport/cassani-e-i-brividi-azzurri-ai-mondiali-di-imola-qui-mi-sono-innamorato-del-ciclismo-da-ragazzino-incredibile-tornarci-da-ct/66184/66267',
'md5': 'e48108e97b1af137d22a8469f2019057',
'info_dict': {
'id': '66184/66267',
'ext': 'mp4',
'title': 'Cassani e i brividi azzurri ai Mondiali di Imola: \\"Qui mi sono innamorato del ciclismo da ragazzino, incredibile tornarci da ct\\"',
'description': 'md5:fc9c50894f70a2469bb9b54d3d0a3d3b',
'thumbnail': r're:^https://www\.repstatic\.it/video/photo/.+?-thumb-social-play\.jpg$',
},
}, {
'url': 'https://video.iltirreno.gelocal.it/sport/dentro-la-notizia-ferrari-cosa-succede-a-maranello/141059/142723',
'md5': 'a6e39f3bdc1842bbd92abbbbef230817',
'info_dict': {
'id': '141059/142723',
'ext': 'mp4',
'title': 'Dentro la notizia - Ferrari, cosa succede a Maranello',
'description': 'md5:9907d65b53765681fa3a0b3122617c1f',
'thumbnail': r're:^https://www\.repstatic\.it/video/photo/.+?-thumb-social-play\.jpg$',
},
}]
class GediEmbedsIE(GediBaseIE):
_VALID_URL = r'''(?x)https?://video\.
(?:
(?:espresso\.)?repubblica
|lastampa
|huffingtonpost
|ilsecoloxix
|iltirreno
|messaggeroveneto
|ilpiccolo
|gazzettadimantova
|mattinopadova
|laprovinciapavese
|tribunatreviso
|nuovavenezia
|gazzettadimodena
|lanuovaferrara
|corrierealpi
|lasentinella
)
(?:\.gelocal)?\.it/embed/.+?/(?P<id>[\d/]+)(?:\?|\&|$)'''
_TESTS = [{
'url': 'https://video.huffingtonpost.it/embed/politica/cotticelli-non-so-cosa-mi-sia-successo-sto-cercando-di-capire-se-ho-avuto-un-malore/29312/29276?responsive=true&el=video971040871621586700',
'md5': 'f4ac23cadfea7fef89bea536583fa7ed',
'info_dict': {
'id': '29312/29276',
'ext': 'mp4',
'title': 'Cotticelli: \\"Non so cosa mi sia successo. Sto cercando di capire se ho avuto un malore\\"',
'description': 'md5:d41d8cd98f00b204e9800998ecf8427e',
'thumbnail': r're:^https://www\.repstatic\.it/video/photo/.+?-thumb-social-play\.jpg$',
},
}, {
'url': 'https://video.espresso.repubblica.it/embed/tutti-i-video/01-ted-villa/14772/14870&width=640&height=360',
'md5': '0391c2c83c6506581003aaf0255889c0',
'info_dict': {
'id': '14772/14870',
'ext': 'mp4',
'title': 'Festival EMERGENCY, Villa: «La buona informazione aiuta la salute» (14772-14870)',
'description': 'md5:2bce954d278248f3c950be355b7c2226',
'thumbnail': r're:^https://www\.repstatic\.it/video/photo/.+?-thumb-social-play\.jpg$',
},
}]
@staticmethod
def _sanitize_urls(urls):
# add protocol if missing
for i, e in enumerate(urls):
if e.startswith('//'):
urls[i] = 'https:%s' % e
# clean iframes urls
for i, e in enumerate(urls):
urls[i] = urljoin(base_url(e), url_basename(e))
return urls
@staticmethod
def _extract_urls(webpage):
entries = [
mobj.group('url')
for mobj in re.finditer(r'''(?x)
(?:
data-frame-src=|
<iframe[^\n]+src=
)
(["'])
(?P<url>https?://video\.
(?:
(?:espresso\.)?repubblica
|lastampa
|huffingtonpost
|ilsecoloxix
|iltirreno
|messaggeroveneto
|ilpiccolo
|gazzettadimantova
|mattinopadova
|laprovinciapavese
|tribunatreviso
|nuovavenezia
|gazzettadimodena
|lanuovaferrara
|corrierealpi
|lasentinella
)
(?:\.gelocal)?\.it/embed/.+?)
\1''', webpage)]
return GediEmbedsIE._sanitize_urls(entries)
@staticmethod
def _extract_url(webpage):
urls = GediEmbedsIE._extract_urls(webpage)
return urls[0] if urls else None

View File

@@ -0,0 +1,210 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
base_url,
determine_ext,
int_or_none,
url_basename,
urljoin,
)
class GediDigitalIE(InfoExtractor):
_VALID_URL = r'''(?x)(?P<url>(?:https?:)//video\.
(?:
(?:
(?:espresso\.)?repubblica
|lastampa
|ilsecoloxix
|huffingtonpost
)|
(?:
iltirreno
|messaggeroveneto
|ilpiccolo
|gazzettadimantova
|mattinopadova
|laprovinciapavese
|tribunatreviso
|nuovavenezia
|gazzettadimodena
|lanuovaferrara
|corrierealpi
|lasentinella
)\.gelocal
)\.it(?:/[^/]+){2,4}/(?P<id>\d+))(?:$|[?&].*)'''
_TESTS = [{
'url': 'https://video.lastampa.it/politica/il-paradosso-delle-regionali-la-lega-vince-ma-sembra-aver-perso/121559/121683',
'md5': '84658d7fb9e55a6e57ecc77b73137494',
'info_dict': {
'id': '121683',
'ext': 'mp4',
'title': 'Il paradosso delle Regionali: ecco perché la Lega vince ma sembra aver perso',
'description': 'md5:de7f4d6eaaaf36c153b599b10f8ce7ca',
'thumbnail': r're:^https://www\.repstatic\.it/video/photo/.+?-thumb-full-.+?\.jpg$',
'duration': 125,
},
}, {
'url': 'https://video.huffingtonpost.it/embed/politica/cotticelli-non-so-cosa-mi-sia-successo-sto-cercando-di-capire-se-ho-avuto-un-malore/29312/29276?responsive=true&el=video971040871621586700',
'only_matching': True,
}, {
'url': 'https://video.espresso.repubblica.it/embed/tutti-i-video/01-ted-villa/14772/14870&width=640&height=360',
'only_matching': True,
}, {
'url': 'https://video.repubblica.it/motori/record-della-pista-a-spa-francorchamps-la-pagani-huayra-roadster-bc-stupisce/367415/367963',
'only_matching': True,
}, {
'url': 'https://video.ilsecoloxix.it/sport/cassani-e-i-brividi-azzurri-ai-mondiali-di-imola-qui-mi-sono-innamorato-del-ciclismo-da-ragazzino-incredibile-tornarci-da-ct/66184/66267',
'only_matching': True,
}, {
'url': 'https://video.iltirreno.gelocal.it/sport/dentro-la-notizia-ferrari-cosa-succede-a-maranello/141059/142723',
'only_matching': True,
}, {
'url': 'https://video.messaggeroveneto.gelocal.it/locale/maria-giovanna-elmi-covid-vaccino/138155/139268',
'only_matching': True,
}, {
'url': 'https://video.ilpiccolo.gelocal.it/dossier/big-john/dinosauro-big-john-al-via-le-visite-guidate-a-trieste/135226/135751',
'only_matching': True,
}, {
'url': 'https://video.gazzettadimantova.gelocal.it/locale/dal-ponte-visconteo-di-valeggio-l-and-8217sos-dei-ristoratori-aprire-anche-a-cena/137310/137818',
'only_matching': True,
}, {
'url': 'https://video.mattinopadova.gelocal.it/dossier/coronavirus-in-veneto/covid-a-vo-un-anno-dopo-un-cuore-tricolore-per-non-dimenticare/138402/138964',
'only_matching': True,
}, {
'url': 'https://video.laprovinciapavese.gelocal.it/locale/mede-zona-rossa-via-alle-vaccinazioni-per-gli-over-80/137545/138120',
'only_matching': True,
}, {
'url': 'https://video.tribunatreviso.gelocal.it/dossier/coronavirus-in-veneto/ecco-le-prima-vaccinazioni-di-massa-nella-marca/134485/135024',
'only_matching': True,
}, {
'url': 'https://video.nuovavenezia.gelocal.it/locale/camion-troppo-alto-per-il-ponte-ferroviario-perde-il-carico/135734/136266',
'only_matching': True,
}, {
'url': 'https://video.gazzettadimodena.gelocal.it/locale/modena-scoperta-la-proteina-che-predice-il-livello-di-gravita-del-covid/139109/139796',
'only_matching': True,
}, {
'url': 'https://video.lanuovaferrara.gelocal.it/locale/due-bombole-di-gpl-aperte-e-abbandonate-i-vigili-bruciano-il-gas/134391/134957',
'only_matching': True,
}, {
'url': 'https://video.corrierealpi.gelocal.it/dossier/cortina-2021-i-mondiali-di-sci-alpino/mondiali-di-sci-il-timelapse-sulla-splendida-olympia/133760/134331',
'only_matching': True,
}, {
'url': 'https://video.lasentinella.gelocal.it/locale/vestigne-centra-un-auto-e-si-ribalta/138931/139466',
'only_matching': True,
}, {
'url': 'https://video.espresso.repubblica.it/tutti-i-video/01-ted-villa/14772',
'only_matching': True,
}]
@staticmethod
def _sanitize_urls(urls):
# add protocol if missing
for i, e in enumerate(urls):
if e.startswith('//'):
urls[i] = 'https:%s' % e
# clean iframes urls
for i, e in enumerate(urls):
urls[i] = urljoin(base_url(e), url_basename(e))
return urls
@staticmethod
def _extract_urls(webpage):
entries = [
mobj.group('eurl')
for mobj in re.finditer(r'''(?x)
(?:
data-frame-src=|
<iframe[^\n]+src=
)
(["'])(?P<eurl>%s)\1''' % GediDigitalIE._VALID_URL, webpage)]
return GediDigitalIE._sanitize_urls(entries)
@staticmethod
def _extract_url(webpage):
urls = GediDigitalIE._extract_urls(webpage)
return urls[0] if urls else None
@staticmethod
def _clean_formats(formats):
format_urls = set()
clean_formats = []
for f in formats:
if f['url'] not in format_urls:
if f.get('audio_ext') != 'none' and not f.get('acodec'):
continue
format_urls.add(f['url'])
clean_formats.append(f)
formats[:] = clean_formats
def _real_extract(self, url):
video_id = self._match_id(url)
url = re.match(self._VALID_URL, url).group('url')
webpage = self._download_webpage(url, video_id)
title = self._html_search_meta(
['twitter:title', 'og:title'], webpage, fatal=True)
player_data = re.findall(
r"PlayerFactory\.setParam\('(?P<type>format|param)',\s*'(?P<name>[^']+)',\s*'(?P<val>[^']+)'\);",
webpage)
formats = []
duration = thumb = None
for t, n, v in player_data:
if t == 'format':
if n in ('video-hds-vod-ec', 'video-hls-vod-ec', 'video-viralize', 'video-youtube-pfp'):
continue
elif n.endswith('-vod-ak'):
formats.extend(self._extract_akamai_formats(
v, video_id, {'http': 'media.gedidigital.it'}))
else:
ext = determine_ext(v)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
v, video_id, 'mp4', 'm3u8_native', m3u8_id=n, fatal=False))
continue
f = {
'format_id': n,
'url': v,
}
if ext == 'mp3':
abr = int_or_none(self._search_regex(
r'-mp3-audio-(\d+)', v, 'abr', default=None))
f.update({
'abr': abr,
'tbr': abr,
'acodec': ext,
'vcodec': 'none'
})
else:
mobj = re.match(r'^video-rrtv-(\d+)(?:-(\d+))?$', n)
if mobj:
f.update({
'height': int(mobj.group(1)),
'vbr': int_or_none(mobj.group(2)),
})
if not f.get('vbr'):
f['vbr'] = int_or_none(self._search_regex(
r'-video-rrtv-(\d+)', v, 'abr', default=None))
formats.append(f)
elif t == 'param':
if n in ['image_full', 'image']:
thumb = v
elif n == 'videoDuration':
duration = int_or_none(v)
self._clean_formats(formats)
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': self._html_search_meta(
['twitter:description', 'og:description', 'description'], webpage),
'thumbnail': thumb or self._og_search_thumbnail(webpage),
'formats': formats,
'duration': duration,
}

View File

@@ -127,7 +127,7 @@ from .expressen import ExpressenIE
from .zype import ZypeIE
from .odnoklassniki import OdnoklassnikiIE
from .kinja import KinjaEmbedIE
from .gedi import GediEmbedsIE
from .gedidigital import GediDigitalIE
from .rcs import RCSEmbedsIE
from .bitchute import BitChuteIE
from .rumble import RumbleEmbedIE
@@ -3339,12 +3339,12 @@ class GenericIE(InfoExtractor):
return self.playlist_from_matches(
zype_urls, video_id, video_title, ie=ZypeIE.ie_key())
# Look for RCS media group embeds
gedi_urls = GediEmbedsIE._extract_urls(webpage)
gedi_urls = GediDigitalIE._extract_urls(webpage)
if gedi_urls:
return self.playlist_from_matches(
gedi_urls, video_id, video_title, ie=GediEmbedsIE.ie_key())
gedi_urls, video_id, video_title, ie=GediDigitalIE.ie_key())
# Look for RCS media group embeds
rcs_urls = RCSEmbedsIE._extract_urls(webpage)
if rcs_urls:
return self.playlist_from_matches(

View File

@@ -4,7 +4,7 @@ from __future__ import unicode_literals
import json
import re
from yt_dlp.utils import int_or_none, unified_timestamp, unescapeHTML
from ..utils import int_or_none, unified_timestamp, unescapeHTML
from .common import InfoExtractor

View File

@@ -21,9 +21,9 @@ from ..utils import (
class LBRYBaseIE(InfoExtractor):
_BASE_URL_REGEX = r'https?://(?:www\.)?(?:lbry\.tv|odysee\.com)/'
_BASE_URL_REGEX = r'(?:https?://(?:www\.)?(?:lbry\.tv|odysee\.com)/|lbry://)'
_CLAIM_ID_REGEX = r'[0-9a-f]{1,40}'
_OPT_CLAIM_ID = '[^:/?#&]+(?::%s)?' % _CLAIM_ID_REGEX
_OPT_CLAIM_ID = '[^:/?#&]+(?:[:#]%s)?' % _CLAIM_ID_REGEX
_SUPPORTED_STREAM_TYPES = ['video', 'audio']
def _call_api_proxy(self, method, display_id, params, resource):
@@ -41,7 +41,9 @@ class LBRYBaseIE(InfoExtractor):
'resolve', display_id, {'urls': url}, resource)[url]
def _permanent_url(self, url, claim_name, claim_id):
return urljoin(url, '/%s:%s' % (claim_name, claim_id))
return urljoin(
url.replace('lbry://', 'https://lbry.tv/'),
'/%s:%s' % (claim_name, claim_id))
def _parse_stream(self, stream, url):
stream_value = stream.get('value') or {}
@@ -137,6 +139,9 @@ class LBRYIE(LBRYBaseIE):
}, {
'url': 'https://lbry.tv/@lacajadepandora:a/TRUMP-EST%C3%81-BIEN-PUESTO-con-Pilar-Baselga,-Carlos-Senra,-Luis-Palacios-(720p_30fps_H264-192kbit_AAC):1',
'only_matching': True,
}, {
'url': 'lbry://@lbry#3f/odysee#7',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -166,7 +171,7 @@ class LBRYIE(LBRYBaseIE):
class LBRYChannelIE(LBRYBaseIE):
IE_NAME = 'lbry:channel'
_VALID_URL = LBRYBaseIE._BASE_URL_REGEX + r'(?P<id>@%s)/?(?:[?#&]|$)' % LBRYBaseIE._OPT_CLAIM_ID
_VALID_URL = LBRYBaseIE._BASE_URL_REGEX + r'(?P<id>@%s)/?(?:[?&]|$)' % LBRYBaseIE._OPT_CLAIM_ID
_TESTS = [{
'url': 'https://lbry.tv/@LBRYFoundation:0',
'info_dict': {
@@ -178,6 +183,9 @@ class LBRYChannelIE(LBRYBaseIE):
}, {
'url': 'https://lbry.tv/@LBRYFoundation',
'only_matching': True,
}, {
'url': 'lbry://@lbry#3f',
'only_matching': True,
}]
_PAGE_SIZE = 50

View File

@@ -7,7 +7,6 @@ from .common import InfoExtractor
from ..compat import (
compat_str,
compat_xpath,
compat_urlparse,
)
from ..utils import (
ExtractorError,
@@ -23,7 +22,6 @@ from ..utils import (
unescapeHTML,
update_url_query,
url_basename,
get_domain,
xpath_text,
)
@@ -45,7 +43,7 @@ class MTVServicesInfoExtractor(InfoExtractor):
# Remove the templates, like &device={device}
return re.sub(r'&[^=]*?={.*?}(?=(&|$))', '', url)
def _get_feed_url(self, uri, url=None):
def _get_feed_url(self, uri):
return self._FEED_URL
def _get_thumbnail_url(self, uri, itemdoc):
@@ -211,9 +209,9 @@ class MTVServicesInfoExtractor(InfoExtractor):
data['lang'] = self._LANG
return data
def _get_videos_info(self, uri, use_hls=True, url=None):
def _get_videos_info(self, uri, use_hls=True):
video_id = self._id_from_uri(uri)
feed_url = self._get_feed_url(uri, url)
feed_url = self._get_feed_url(uri)
info_url = update_url_query(feed_url, self._get_feed_query(uri))
return self._get_videos_info_from_url(info_url, video_id, use_hls)
@@ -259,41 +257,7 @@ class MTVServicesInfoExtractor(InfoExtractor):
def _extract_child_with_type(parent, t):
return next(c for c in parent['children'] if c.get('type') == t)
def _extract_new_triforce_mgid(self, webpage, url='', video_id=None):
if url == '':
return
domain = get_domain(url)
if domain is None:
raise ExtractorError(
'[%s] could not get domain' % self.IE_NAME,
expected=True)
url = url.replace("https://", "http://")
enc_url = compat_urlparse.quote(url, safe='')
_TRIFORCE_V8_TEMPLATE = 'https://%s/feeds/triforce/manifest/v8?url=%s'
triforce_manifest_url = _TRIFORCE_V8_TEMPLATE % (domain, enc_url)
manifest = self._download_json(triforce_manifest_url, video_id, fatal=False)
if manifest:
if manifest.get('manifest').get('type') == 'redirect':
self.to_screen('Found a redirect. Downloading manifest from new location')
new_loc = manifest.get('manifest').get('newLocation')
new_loc = new_loc.replace("https://", "http://")
enc_new_loc = compat_urlparse.quote(new_loc, safe='')
triforce_manifest_new_loc = _TRIFORCE_V8_TEMPLATE % (domain, enc_new_loc)
manifest = self._download_json(triforce_manifest_new_loc, video_id, fatal=False)
item_id = try_get(manifest, lambda x: x['manifest']['reporting']['itemId'], compat_str)
if not item_id:
self.to_screen('No id found!')
return
# 'episode' can be anything. 'content' is used often as well
_MGID_TEMPLATE = 'mgid:arc:episode:%s:%s'
mgid = _MGID_TEMPLATE % (domain, item_id)
return mgid
def _extract_mgid(self, webpage, url, title=None, data_zone=None):
def _extract_mgid(self, webpage):
try:
# the url can be http://media.mtvnservices.com/fb/{mgid}.swf
# or http://media.mtvnservices.com/{mgid}
@@ -304,21 +268,6 @@ class MTVServicesInfoExtractor(InfoExtractor):
except RegexNotFoundError:
mgid = None
if not title:
title = url_basename(url)
try:
window_data = self._parse_json(self._search_regex(
r'(?s)window.__DATA__ = (?P<json>{.+});', webpage,
'JSON Window Data', default=None, fatal=False, group='json'), title, fatal=False)
main_container = None
for i in range(len(window_data['children'])):
if window_data['children'][i]['type'] == 'MainContainer':
main_container = window_data['children'][i]
mgid = main_container['children'][0]['props']['media']['video']['config']['uri']
except (KeyError, IndexError, TypeError):
pass
if mgid is None or ':' not in mgid:
mgid = self._search_regex(
[r'data-mgid="(.*?)"', r'swfobject\.embedSWF\(".*?(mgid:.*?)"'],
@@ -331,10 +280,7 @@ class MTVServicesInfoExtractor(InfoExtractor):
r'embed/(mgid:.+?)["\'&?/]', sm4_embed, 'mgid', default=None)
if not mgid:
mgid = self._extract_new_triforce_mgid(webpage, url)
if not mgid:
mgid = self._extract_triforce_mgid(webpage, data_zone)
mgid = self._extract_triforce_mgid(webpage)
if not mgid:
data = self._parse_json(self._search_regex(
@@ -348,8 +294,8 @@ class MTVServicesInfoExtractor(InfoExtractor):
def _real_extract(self, url):
title = url_basename(url)
webpage = self._download_webpage(url, title)
mgid = self._extract_mgid(webpage, url, title=title)
videos_info = self._get_videos_info(mgid, url=url)
mgid = self._extract_mgid(webpage)
videos_info = self._get_videos_info(mgid)
return videos_info

View File

@@ -0,0 +1,127 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
js_to_json,
qualities,
try_get,
url_or_none,
urljoin,
)
class MxplayerIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?mxplayer\.in/(?:show|movie)/(?:(?P<display_id>[-/a-z0-9]+)-)?(?P<id>[a-z0-9]+)'
_TESTS = [{
'url': 'https://www.mxplayer.in/movie/watch-knock-knock-hindi-dubbed-movie-online-b9fa28df3bfb8758874735bbd7d2655a?watch=true',
'info_dict': {
'id': 'b9fa28df3bfb8758874735bbd7d2655a',
'ext': 'mp4',
'title': 'Knock Knock (Hindi Dubbed)',
'description': 'md5:b195ba93ff1987309cfa58e2839d2a5b'
},
'params': {
'skip_download': True,
'format': 'bestvideo'
}
}, {
'url': 'https://www.mxplayer.in/show/watch-shaitaan/season-1/the-infamous-taxi-gang-of-meerut-online-45055d5bcff169ad48f2ad7552a83d6c',
'info_dict': {
'id': '45055d5bcff169ad48f2ad7552a83d6c',
'ext': 'm3u8',
'title': 'The infamous taxi gang of Meerut',
'description': 'md5:033a0a7e3fd147be4fb7e07a01a3dc28',
'season': 'Season 1',
'series': 'Shaitaan'
},
'params': {
'skip_download': True,
}
}, {
'url': 'https://www.mxplayer.in/show/watch-aashram/chapter-1/duh-swapna-online-d445579792b0135598ba1bc9088a84cb',
'info_dict': {
'id': 'd445579792b0135598ba1bc9088a84cb',
'ext': 'mp4',
'title': 'Duh Swapna',
'description': 'md5:35ff39c4bdac403c53be1e16a04192d8',
'season': 'Chapter 1',
'series': 'Aashram'
},
'expected_warnings': ['Unknown MIME type application/mp4 in DASH manifest'],
'params': {
'skip_download': True,
'format': 'bestvideo'
}
}]
def _get_stream_urls(self, video_dict):
stream_provider_dict = try_get(
video_dict,
lambda x: x['stream'][x['stream']['provider']])
if not stream_provider_dict:
raise ExtractorError('No stream provider found', expected=True)
for stream_name, stream in stream_provider_dict.items():
if stream_name in ('hls', 'dash', 'hlsUrl', 'dashUrl'):
stream_type = stream_name.replace('Url', '')
if isinstance(stream, dict):
for quality, stream_url in stream.items():
if stream_url:
yield stream_type, quality, stream_url
else:
yield stream_type, 'base', stream
def _real_extract(self, url):
display_id, video_id = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(url, video_id)
source = self._parse_json(
js_to_json(self._html_search_regex(
r'(?s)<script>window\.state\s*[:=]\s(\{.+\})\n(\w+).*(</script>).*',
webpage, 'WindowState')),
video_id)
if not source:
raise ExtractorError('Cannot find source', expected=True)
config_dict = source['config']
video_dict = source['entities'][video_id]
thumbnails = []
for i in video_dict.get('imageInfo') or []:
thumbnails.append({
'url': urljoin(config_dict['imageBaseUrl'], i['url']),
'width': i['width'],
'height': i['height'],
})
formats = []
get_quality = qualities(['main', 'base', 'high'])
for stream_type, quality, stream_url in self._get_stream_urls(video_dict):
format_url = url_or_none(urljoin(config_dict['videoCdnBaseUrl'], stream_url))
if not format_url:
continue
if stream_type == 'dash':
dash_formats = self._extract_mpd_formats(
format_url, video_id, mpd_id='dash-%s' % quality, headers={'Referer': url})
for frmt in dash_formats:
frmt['quality'] = get_quality(quality)
formats.extend(dash_formats)
elif stream_type == 'hls':
formats.extend(self._extract_m3u8_formats(
format_url, video_id, fatal=False,
m3u8_id='hls-%s' % quality, quality=get_quality(quality)))
self._sort_formats(formats)
return {
'id': video_id,
'display_id': display_id.replace('/', '-'),
'title': video_dict['title'] or self._og_search_title(webpage),
'formats': formats,
'description': video_dict.get('description'),
'season': try_get(video_dict, lambda x: x['container']['title']),
'series': try_get(video_dict, lambda x: x['container']['container']['title']),
'thumbnails': thumbnails,
}

View File

@@ -8,59 +8,66 @@ from ..utils import update_url_query
class NickIE(MTVServicesInfoExtractor):
# None of videos on the website are still alive?
IE_NAME = 'nick.com'
_VALID_URL = r'https?://(?P<domain>(?:(?:www|beta)\.)?nick(?:jr)?\.com)/(?:[^/]+/)?(?:videos/clip|[^/]+/videos)/(?P<id>[^/?#.]+)'
_VALID_URL = r'https?://(?P<domain>(?:www\.)?nick(?:jr)?\.com)/(?:[^/]+/)?(?P<type>videos/clip|[^/]+/videos|episodes/[^/]+)/(?P<id>[^/?#.]+)'
_FEED_URL = 'http://udat.mtvnservices.com/service1/dispatch.htm'
_GEO_COUNTRIES = ['US']
_TESTS = [{
'url': 'http://www.nick.com/videos/clip/alvinnn-and-the-chipmunks-112-full-episode.html',
'url': 'https://www.nick.com/episodes/sq47rw/spongebob-squarepants-a-place-for-pets-lockdown-for-love-season-13-ep-1',
'info_dict': {
'description': 'md5:0650a9eb88955609d5c1d1c79292e234',
'title': 'A Place for Pets/Lockdown for Love',
},
'playlist': [
{
'md5': '6e5adc1e28253bbb1b28ab05403dd4d4',
'md5': 'cb8a2afeafb7ae154aca5a64815ec9d6',
'info_dict': {
'id': 'be6a17b0-412d-11e5-8ff7-0026b9414f30',
'id': '85ee8177-d6ce-48f8-9eee-a65364f8a6df',
'ext': 'mp4',
'title': 'ALVINNN!!! and The Chipmunks: "Mojo Missing/Who\'s The Animal" S1',
'description': 'Alvin is convinced his mojo was in a cap he gave to a fan, and must find a way to get his hat back before the Chipmunks big concert.\nDuring a costume visit to the zoo, Alvin finds himself mistaken for the real Tasmanian devil.',
'title': 'SpongeBob SquarePants: "A Place for Pets/Lockdown for Love" S1',
'description': 'A Place for Pets/Lockdown for Love: When customers bring pets into the Krusty Krab, Mr. Krabs realizes pets are more profitable than owners. Plankton ruins another date with Karen, so she puts the Chum Bucket on lockdown until he proves his affection.',
}
},
{
'md5': 'd7be441fc53a1d4882fa9508a1e5b3ce',
'md5': '839a04f49900a1fcbf517020d94e0737',
'info_dict': {
'id': 'be6b8f96-412d-11e5-8ff7-0026b9414f30',
'id': '2e2a9960-8fd4-411d-868b-28eb1beb7fae',
'ext': 'mp4',
'title': 'ALVINNN!!! and The Chipmunks: "Mojo Missing/Who\'s The Animal" S2',
'description': 'Alvin is convinced his mojo was in a cap he gave to a fan, and must find a way to get his hat back before the Chipmunks big concert.\nDuring a costume visit to the zoo, Alvin finds himself mistaken for the real Tasmanian devil.',
'title': 'SpongeBob SquarePants: "A Place for Pets/Lockdown for Love" S2',
'description': 'A Place for Pets/Lockdown for Love: When customers bring pets into the Krusty Krab, Mr. Krabs realizes pets are more profitable than owners. Plankton ruins another date with Karen, so she puts the Chum Bucket on lockdown until he proves his affection.',
}
},
{
'md5': 'efffe1728a234b2b0d2f2b343dd1946f',
'md5': 'f1145699f199770e2919ee8646955d46',
'info_dict': {
'id': 'be6cf7e6-412d-11e5-8ff7-0026b9414f30',
'id': 'dc91c304-6876-40f7-84a6-7aece7baa9d0',
'ext': 'mp4',
'title': 'ALVINNN!!! and The Chipmunks: "Mojo Missing/Who\'s The Animal" S3',
'description': 'Alvin is convinced his mojo was in a cap he gave to a fan, and must find a way to get his hat back before the Chipmunks big concert.\nDuring a costume visit to the zoo, Alvin finds himself mistaken for the real Tasmanian devil.',
'title': 'SpongeBob SquarePants: "A Place for Pets/Lockdown for Love" S3',
'description': 'A Place for Pets/Lockdown for Love: When customers bring pets into the Krusty Krab, Mr. Krabs realizes pets are more profitable than owners. Plankton ruins another date with Karen, so she puts the Chum Bucket on lockdown until he proves his affection.',
}
},
{
'md5': '1ec6690733ab9f41709e274a1d5c7556',
'md5': 'd463116875aee2585ee58de3b12caebd',
'info_dict': {
'id': 'be6e3354-412d-11e5-8ff7-0026b9414f30',
'id': '5d929486-cf4c-42a1-889a-6e0d183a101a',
'ext': 'mp4',
'title': 'ALVINNN!!! and The Chipmunks: "Mojo Missing/Who\'s The Animal" S4',
'description': 'Alvin is convinced his mojo was in a cap he gave to a fan, and must find a way to get his hat back before the Chipmunks big concert.\nDuring a costume visit to the zoo, Alvin finds himself mistaken for the real Tasmanian devil.',
'title': 'SpongeBob SquarePants: "A Place for Pets/Lockdown for Love" S4',
'description': 'A Place for Pets/Lockdown for Love: When customers bring pets into the Krusty Krab, Mr. Krabs realizes pets are more profitable than owners. Plankton ruins another date with Karen, so she puts the Chum Bucket on lockdown until he proves his affection.',
}
},
],
}, {
'url': 'http://www.nickjr.com/paw-patrol/videos/pups-save-a-goldrush-s3-ep302-full-episode/',
'only_matching': True,
}, {
'url': 'http://beta.nick.com/nicky-ricky-dicky-and-dawn/videos/nicky-ricky-dicky-dawn-301-full-episode/',
'only_matching': True,
'url': 'http://www.nickjr.com/blues-clues-and-you/videos/blues-clues-and-you-original-209-imagination-station/',
'info_dict': {
'id': '31631529-2fc5-430b-b2ef-6a74b4609abd',
'ext': 'mp4',
'description': 'md5:9d65a66df38e02254852794b2809d1cf',
'title': 'Blue\'s Imagination Station',
},
}]
def _get_feed_query(self, uri):
@@ -69,8 +76,14 @@ class NickIE(MTVServicesInfoExtractor):
'mgid': uri,
}
def _extract_mgid(self, webpage):
mgid = self._search_regex(r'"media":{"video":{"config":{"uri":"(mgid:.*?)"', webpage, 'mgid', default=None)
return mgid
def _real_extract(self, url):
domain, display_id = re.match(self._VALID_URL, url).groups()
domain, video_type, display_id = re.match(self._VALID_URL, url).groups()
if video_type.startswith("episodes"):
return super()._real_extract(url)
video_data = self._download_json(
'http://%s/data/video.endLevel.json' % domain,
display_id, query={

View File

@@ -23,11 +23,9 @@ class NineCNineMediaIE(InfoExtractor):
destination_code, content_id = re.match(self._VALID_URL, url).groups()
api_base_url = self._API_BASE_TEMPLATE % (destination_code, content_id)
content = self._download_json(api_base_url, content_id, query={
'$include': '[Media,Season,ContentPackages]',
'$include': '[Media.Name,Season,ContentPackages.Duration,ContentPackages.Id]',
})
title = content['Name']
if len(content['ContentPackages']) > 1:
raise ExtractorError('multiple content packages')
content_package = content['ContentPackages'][0]
package_id = content_package['Id']
content_package_url = api_base_url + 'contentpackages/%s/' % package_id

View File

@@ -1,52 +1,128 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import ExtractorError
import re
from .youtube import YoutubeIE
from .zdf import ZDFBaseIE
from ..compat import compat_str
from ..utils import (
int_or_none,
merge_dicts,
unified_timestamp,
xpath_text,
)
class PhoenixIE(InfoExtractor):
class PhoenixIE(ZDFBaseIE):
IE_NAME = 'phoenix.de'
_VALID_URL = r'''https?://(?:www\.)?phoenix.de/\D+(?P<id>\d+)\.html'''
_TESTS = [
{
'url': 'https://www.phoenix.de/sendungen/dokumentationen/unsere-welt-in-zukunft---stadt-a-1283620.html',
'md5': '5e765e838aa3531c745a4f5b249ee3e3',
_VALID_URL = r'https?://(?:www\.)?phoenix\.de/(?:[^/]+/)*[^/?#&]*-a-(?P<id>\d+)\.html'
_TESTS = [{
# Same as https://www.zdf.de/politik/phoenix-sendungen/wohin-fuehrt-der-protest-in-der-pandemie-100.html
'url': 'https://www.phoenix.de/sendungen/ereignisse/corona-nachgehakt/wohin-fuehrt-der-protest-in-der-pandemie-a-2050630.html',
'md5': '34ec321e7eb34231fd88616c65c92db0',
'info_dict': {
'id': '0OB4HFc43Ns',
'id': '210222_phx_nachgehakt_corona_protest',
'ext': 'mp4',
'title': 'Unsere Welt in Zukunft - Stadt',
'description': 'md5:9bfb6fd498814538f953b2dcad7ce044',
'upload_date': '20190912',
'title': 'Wohin führt der Protest in der Pandemie?',
'description': 'md5:7d643fe7f565e53a24aac036b2122fbd',
'duration': 1691,
'timestamp': 1613906100,
'upload_date': '20210221',
'uploader': 'Phoenix',
'channel': 'corona nachgehakt',
},
}, {
# Youtube embed
'url': 'https://www.phoenix.de/sendungen/gespraeche/phoenix-streitgut-brennglas-corona-a-1965505.html',
'info_dict': {
'id': 'hMQtqFYjomk',
'ext': 'mp4',
'title': 'phoenix streitgut: Brennglas Corona - Wie gerecht ist unsere Gesellschaft?',
'description': 'md5:ac7a02e2eb3cb17600bc372e4ab28fdd',
'duration': 3509,
'upload_date': '20201219',
'uploader': 'phoenix',
'uploader_id': 'phoenix',
}
},
{
'url': 'https://www.phoenix.de/drohnenangriffe-in-saudi-arabien-a-1286995.html?ref=aktuelles',
'params': {
'skip_download': True,
},
}, {
'url': 'https://www.phoenix.de/entwicklungen-in-russland-a-2044720.html',
'only_matching': True,
},
# an older page: https://www.phoenix.de/sendungen/gespraeche/phoenix-persoenlich/im-dialog-a-177727.html
# seems to not have an embedded video, even though it's uploaded on youtube: https://www.youtube.com/watch?v=4GxnoUHvOkM
]
def extract_from_json_api(self, video_id, api_url):
doc = self._download_json(
api_url, video_id,
note="Downloading webpage metadata",
errnote="Failed to load webpage metadata")
for a in doc["absaetze"]:
if a["typ"] == "video-youtube":
return {
'_type': 'url_transparent',
'id': a["id"],
'title': doc["titel"],
'url': "https://www.youtube.com/watch?v=%s" % a["id"],
'ie_key': 'Youtube',
}
raise ExtractorError("No downloadable video found", expected=True)
}, {
# no media
'url': 'https://www.phoenix.de/sendungen/dokumentationen/mit-dem-jumbo-durch-die-nacht-a-89625.html',
'only_matching': True,
}, {
# Same as https://www.zdf.de/politik/phoenix-sendungen/die-gesten-der-maechtigen-100.html
'url': 'https://www.phoenix.de/sendungen/dokumentationen/gesten-der-maechtigen-i-a-89468.html?ref=suche',
'only_matching': True,
}]
def _real_extract(self, url):
page_id = self._match_id(url)
api_url = 'https://www.phoenix.de/response/id/%s' % page_id
return self.extract_from_json_api(page_id, api_url)
article_id = self._match_id(url)
article = self._download_json(
'https://www.phoenix.de/response/id/%s' % article_id, article_id,
'Downloading article JSON')
video = article['absaetze'][0]
title = video.get('titel') or article.get('subtitel')
if video.get('typ') == 'video-youtube':
video_id = video['id']
return self.url_result(
video_id, ie=YoutubeIE.ie_key(), video_id=video_id,
video_title=title)
video_id = compat_str(video.get('basename') or video.get('content'))
details = self._download_xml(
'https://www.phoenix.de/php/mediaplayer/data/beitrags_details.php',
video_id, 'Downloading details XML', query={
'ak': 'web',
'ptmd': 'true',
'id': video_id,
'profile': 'player2',
})
title = title or xpath_text(
details, './/information/title', 'title', fatal=True)
content_id = xpath_text(
details, './/video/details/basename', 'content id', fatal=True)
info = self._extract_ptmd(
'https://tmd.phoenix.de/tmd/2/ngplayer_2_3/vod/ptmd/phoenix/%s' % content_id,
content_id, None, url)
timestamp = unified_timestamp(xpath_text(details, './/details/airtime'))
thumbnails = []
for node in details.findall('.//teaserimages/teaserimage'):
thumbnail_url = node.text
if not thumbnail_url:
continue
thumbnail = {
'url': thumbnail_url,
}
thumbnail_key = node.get('key')
if thumbnail_key:
m = re.match('^([0-9]+)x([0-9]+)$', thumbnail_key)
if m:
thumbnail['width'] = int(m.group(1))
thumbnail['height'] = int(m.group(2))
thumbnails.append(thumbnail)
return merge_dicts(info, {
'id': content_id,
'title': title,
'description': xpath_text(details, './/information/detail'),
'duration': int_or_none(xpath_text(details, './/details/lengthSec')),
'thumbnails': thumbnails,
'timestamp': timestamp,
'uploader': xpath_text(details, './/details/channel'),
'uploader_id': xpath_text(details, './/details/originChannelId'),
'channel': xpath_text(details, './/details/originChannelTitle'),
})

View File

@@ -158,6 +158,10 @@ class RaiPlayIE(RaiBaseIE):
# subtitles at 'subtitlesArray' key (see #27698)
'url': 'https://www.raiplay.it/video/2020/12/Report---04-01-2021-2e90f1de-8eee-4de4-ac0e-78d21db5b600.html',
'only_matching': True,
}, {
# DRM protected
'url': 'https://www.raiplay.it/video/2020/09/Lo-straordinario-mondo-di-Zoey-S1E1-Lo-straordinario-potere-di-Zoey-ed493918-1d32-44b7-8454-862e473d00ff.html',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -166,6 +170,14 @@ class RaiPlayIE(RaiBaseIE):
media = self._download_json(
base + '.json', video_id, 'Downloading video JSON')
if not self.params.get('allow_unplayable_formats'):
if try_get(
media,
(lambda x: x['rights_management']['rights']['drm'],
lambda x: x['program_info']['rights_management']['rights']['drm']),
dict):
raise ExtractorError('This video is DRM protected.', expected=True)
title = media['name']
video = media['video']

View File

@@ -15,17 +15,17 @@ class RDSIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?rds\.ca/vid(?:[eé]|%C3%A9)os/(?:[^/]+/)*(?P<id>[^/]+)-\d+\.\d+'
_TESTS = [{
'url': 'http://www.rds.ca/videos/football/nfl/fowler-jr-prend-la-direction-de-jacksonville-3.1132799',
# has two 9c9media ContentPackages, the web player selects the first ContentPackage
'url': 'https://www.rds.ca/videos/Hockey/NationalHockeyLeague/teams/9/forum-du-5-a-7-jesperi-kotkaniemi-de-retour-de-finlande-3.1377606',
'info_dict': {
'id': '604333',
'display_id': 'fowler-jr-prend-la-direction-de-jacksonville',
'id': '2083309',
'display_id': 'forum-du-5-a-7-jesperi-kotkaniemi-de-retour-de-finlande',
'ext': 'flv',
'title': 'Fowler Jr. prend la direction de Jacksonville',
'description': 'Dante Fowler Jr. est le troisième choix du repêchage 2015 de la NFL. ',
'timestamp': 1430397346,
'upload_date': '20150430',
'duration': 154.354,
'age_limit': 0,
'title': 'Forum du 5 à 7 : Kotkaniemi de retour de Finlande',
'description': 'md5:83fa38ecc4a79b19e433433254077f25',
'timestamp': 1606129030,
'upload_date': '20201123',
'duration': 773.039,
}
}, {
'url': 'http://www.rds.ca/vid%C3%A9os/un-voyage-positif-3.877934',

View File

@@ -6,11 +6,12 @@ import re
from .srgssr import SRGSSRIE
from ..compat import compat_str
from ..utils import (
determine_ext,
int_or_none,
parse_duration,
parse_iso8601,
unescapeHTML,
determine_ext,
urljoin,
)
@@ -21,7 +22,7 @@ class RTSIE(SRGSSRIE):
_TESTS = [
{
'url': 'http://www.rts.ch/archives/tv/divers/3449373-les-enfants-terribles.html',
'md5': 'ff7f8450a90cf58dacb64e29707b4a8e',
'md5': '753b877968ad8afaeddccc374d4256a5',
'info_dict': {
'id': '3449373',
'display_id': 'les-enfants-terribles',
@@ -35,6 +36,7 @@ class RTSIE(SRGSSRIE):
'thumbnail': r're:^https?://.*\.image',
'view_count': int,
},
'expected_warnings': ['Unable to download f4m manifest', 'Failed to download m3u8 information'],
},
{
'url': 'http://www.rts.ch/emissions/passe-moi-les-jumelles/5624067-entre-ciel-et-mer.html',
@@ -63,11 +65,12 @@ class RTSIE(SRGSSRIE):
# m3u8 download
'skip_download': True,
},
'expected_warnings': ['Unable to download f4m manifest', 'Failed to download m3u8 information'],
'skip': 'Blocked outside Switzerland',
},
{
'url': 'http://www.rts.ch/video/info/journal-continu/5745356-londres-cachee-par-un-epais-smog.html',
'md5': '1bae984fe7b1f78e94abc74e802ed99f',
'md5': '9bb06503773c07ce83d3cbd793cebb91',
'info_dict': {
'id': '5745356',
'display_id': 'londres-cachee-par-un-epais-smog',
@@ -81,6 +84,7 @@ class RTSIE(SRGSSRIE):
'thumbnail': r're:^https?://.*\.image',
'view_count': int,
},
'expected_warnings': ['Unable to download f4m manifest', 'Failed to download m3u8 information'],
},
{
'url': 'http://www.rts.ch/audio/couleur3/programmes/la-belle-video-de-stephane-laurenceau/5706148-urban-hippie-de-damien-krisl-03-04-2014.html',
@@ -160,7 +164,7 @@ class RTSIE(SRGSSRIE):
media_type = 'video' if 'video' in all_info else 'audio'
# check for errors
self.get_media_data('rts', media_type, media_id)
self._get_media_data('rts', media_type, media_id)
info = all_info['video']['JSONinfo'] if 'video' in all_info else all_info['audio']
@@ -194,6 +198,7 @@ class RTSIE(SRGSSRIE):
'tbr': extract_bitrate(format_url),
})
download_base = 'http://rtsww%s-d.rts.ch/' % ('-a' if media_type == 'audio' else '')
for media in info.get('media', []):
media_url = media.get('url')
if not media_url or re.match(r'https?://', media_url):
@@ -205,7 +210,7 @@ class RTSIE(SRGSSRIE):
format_id += '-%dk' % rate
formats.append({
'format_id': format_id,
'url': 'http://download-video.rts.ch/' + media_url,
'url': urljoin(download_base, media_url),
'tbr': rate or extract_bitrate(media_url),
})

View File

@@ -4,16 +4,32 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urllib_parse_urlparse
from ..utils import (
ExtractorError,
float_or_none,
int_or_none,
parse_iso8601,
qualities,
try_get,
)
class SRGSSRIE(InfoExtractor):
_VALID_URL = r'(?:https?://tp\.srgssr\.ch/p(?:/[^/]+)+\?urn=urn|srgssr):(?P<bu>srf|rts|rsi|rtr|swi):(?:[^:]+:)?(?P<type>video|audio):(?P<id>[0-9a-f\-]{36}|\d+)'
_VALID_URL = r'''(?x)
(?:
https?://tp\.srgssr\.ch/p(?:/[^/]+)+\?urn=urn|
srgssr
):
(?P<bu>
srf|rts|rsi|rtr|swi
):(?:[^:]+:)?
(?P<type>
video|audio
):
(?P<id>
[0-9a-f\-]{36}|\d+
)
'''
_GEO_BYPASS = False
_GEO_COUNTRIES = ['CH']
@@ -25,25 +41,39 @@ class SRGSSRIE(InfoExtractor):
'LEGAL': 'The video cannot be transmitted for legal reasons.',
'STARTDATE': 'This video is not yet available. Please try again later.',
}
_DEFAULT_LANGUAGE_CODES = {
'srf': 'de',
'rts': 'fr',
'rsi': 'it',
'rtr': 'rm',
'swi': 'en',
}
def _get_tokenized_src(self, url, video_id, format_id):
sp = compat_urllib_parse_urlparse(url).path.split('/')
token = self._download_json(
'http://tp.srgssr.ch/akahd/token?acl=/%s/%s/*' % (sp[1], sp[2]),
'http://tp.srgssr.ch/akahd/token?acl=*',
video_id, 'Downloading %s token' % format_id, fatal=False) or {}
auth_params = token.get('token', {}).get('authparams')
auth_params = try_get(token, lambda x: x['token']['authparams'])
if auth_params:
url += '?' + auth_params
url += ('?' if '?' not in url else '&') + auth_params
return url
def get_media_data(self, bu, media_type, media_id):
media_data = self._download_json(
'http://il.srgssr.ch/integrationlayer/1.0/ue/%s/%s/play/%s.json' % (bu, media_type, media_id),
media_id)[media_type.capitalize()]
def _get_media_data(self, bu, media_type, media_id):
query = {'onlyChapters': True} if media_type == 'video' else {}
full_media_data = self._download_json(
'https://il.srgssr.ch/integrationlayer/2.0/%s/mediaComposition/%s/%s.json'
% (bu, media_type, media_id),
media_id, query=query)['chapterList']
try:
media_data = next(
x for x in full_media_data if x.get('id') == media_id)
except StopIteration:
raise ExtractorError('No media information found')
if media_data.get('block') and media_data['block'] in self._ERRORS:
message = self._ERRORS[media_data['block']]
if media_data['block'] == 'GEOBLOCK':
block_reason = media_data.get('blockReason')
if block_reason and block_reason in self._ERRORS:
message = self._ERRORS[block_reason]
if block_reason == 'GEOBLOCK':
self.raise_geo_restricted(
msg=message, countries=self._GEO_COUNTRIES)
raise ExtractorError(
@@ -53,53 +83,75 @@ class SRGSSRIE(InfoExtractor):
def _real_extract(self, url):
bu, media_type, media_id = re.match(self._VALID_URL, url).groups()
media_data = self._get_media_data(bu, media_type, media_id)
title = media_data['title']
media_data = self.get_media_data(bu, media_type, media_id)
metadata = media_data['AssetMetadatas']['AssetMetadata'][0]
title = metadata['title']
description = metadata.get('description')
created_date = media_data.get('createdDate') or metadata.get('createdDate')
timestamp = parse_iso8601(created_date)
thumbnails = [{
'id': image.get('id'),
'url': image['url'],
} for image in media_data.get('Image', {}).get('ImageRepresentations', {}).get('ImageRepresentation', [])]
preference = qualities(['LQ', 'MQ', 'SD', 'HQ', 'HD'])
formats = []
for source in media_data.get('Playlists', {}).get('Playlist', []) + media_data.get('Downloads', {}).get('Download', []):
protocol = source.get('@protocol')
for asset in source['url']:
asset_url = asset['text']
quality = asset['@quality']
format_id = '%s-%s' % (protocol, quality)
if protocol.startswith('HTTP-HDS') or protocol.startswith('HTTP-HLS'):
asset_url = self._get_tokenized_src(asset_url, media_id, format_id)
if protocol.startswith('HTTP-HDS'):
formats.extend(self._extract_f4m_formats(
asset_url + ('?' if '?' not in asset_url else '&') + 'hdcore=3.4.0',
media_id, f4m_id=format_id, fatal=False))
elif protocol.startswith('HTTP-HLS'):
q = qualities(['SD', 'HD'])
for source in (media_data.get('resourceList') or []):
format_url = source.get('url')
if not format_url:
continue
protocol = source.get('protocol')
quality = source.get('quality')
format_id = []
for e in (protocol, source.get('encoding'), quality):
if e:
format_id.append(e)
format_id = '-'.join(format_id)
if protocol in ('HDS', 'HLS'):
if source.get('tokenType') == 'AKAMAI':
format_url = self._get_tokenized_src(
format_url, media_id, format_id)
formats.extend(self._extract_akamai_formats(
format_url, media_id))
elif protocol == 'HLS':
formats.extend(self._extract_m3u8_formats(
asset_url, media_id, 'mp4', 'm3u8_native',
format_url, media_id, 'mp4', 'm3u8_native',
m3u8_id=format_id, fatal=False))
else:
elif protocol in ('HTTP', 'HTTPS'):
formats.append({
'format_id': format_id,
'url': asset_url,
'quality': preference(quality),
'ext': 'flv' if protocol == 'RTMP' else None,
'url': format_url,
'quality': q(quality),
})
# This is needed because for audio medias the podcast url is usually
# always included, even if is only an audio segment and not the
# whole episode.
if int_or_none(media_data.get('position')) == 0:
for p in ('S', 'H'):
podcast_url = media_data.get('podcast%sdUrl' % p)
if not podcast_url:
continue
quality = p + 'D'
formats.append({
'format_id': 'PODCAST-' + quality,
'url': podcast_url,
'quality': q(quality),
})
self._sort_formats(formats)
subtitles = {}
if media_type == 'video':
for sub in (media_data.get('subtitleList') or []):
sub_url = sub.get('url')
if not sub_url:
continue
lang = sub.get('locale') or self._DEFAULT_LANGUAGE_CODES[bu]
subtitles.setdefault(lang, []).append({
'url': sub_url,
})
return {
'id': media_id,
'title': title,
'description': description,
'timestamp': timestamp,
'thumbnails': thumbnails,
'description': media_data.get('description'),
'timestamp': parse_iso8601(media_data.get('date')),
'thumbnail': media_data.get('imageUrl'),
'duration': float_or_none(media_data.get('duration'), 1000),
'subtitles': subtitles,
'formats': formats,
}
@@ -119,26 +171,17 @@ class SRGSSRPlayIE(InfoExtractor):
_TESTS = [{
'url': 'http://www.srf.ch/play/tv/10vor10/video/snowden-beantragt-asyl-in-russland?id=28e1a57d-5b76-4399-8ab3-9097f071e6c5',
'md5': 'da6b5b3ac9fa4761a942331cef20fcb3',
'md5': '6db2226ba97f62ad42ce09783680046c',
'info_dict': {
'id': '28e1a57d-5b76-4399-8ab3-9097f071e6c5',
'ext': 'mp4',
'upload_date': '20130701',
'title': 'Snowden beantragt Asyl in Russland',
'timestamp': 1372713995,
}
}, {
# No Speichern (Save) button
'url': 'http://www.srf.ch/play/tv/top-gear/video/jaguar-xk120-shadow-und-tornado-dampflokomotive?id=677f5829-e473-4823-ac83-a1087fe97faa',
'md5': '0a274ce38fda48c53c01890651985bc6',
'info_dict': {
'id': '677f5829-e473-4823-ac83-a1087fe97faa',
'ext': 'flv',
'upload_date': '20130710',
'title': 'Jaguar XK120, Shadow und Tornado-Dampflokomotive',
'description': 'md5:88604432b60d5a38787f152dec89cd56',
'timestamp': 1373493600,
'timestamp': 1372708215,
'duration': 113.827,
'thumbnail': r're:^https?://.*1383719781\.png$',
},
'expected_warnings': ['Unable to download f4m manifest'],
}, {
'url': 'http://www.rtr.ch/play/radio/actualitad/audio/saira-tujetsch-tuttina-cuntinuar-cun-sedrun-muster-turissem?id=63cb0778-27f8-49af-9284-8c7a8c6d15fc',
'info_dict': {
@@ -146,7 +189,8 @@ class SRGSSRPlayIE(InfoExtractor):
'ext': 'mp3',
'upload_date': '20151013',
'title': 'Saira: Tujetsch - tuttina cuntinuar cun Sedrun Mustér Turissem',
'timestamp': 1444750398,
'timestamp': 1444709160,
'duration': 336.816,
},
'params': {
# rtmp download
@@ -159,19 +203,32 @@ class SRGSSRPlayIE(InfoExtractor):
'id': '6348260',
'display_id': '6348260',
'ext': 'mp4',
'duration': 1796,
'duration': 1796.76,
'title': 'Le 19h30',
'description': '',
'uploader': '19h30',
'upload_date': '20141201',
'timestamp': 1417458600,
'thumbnail': r're:^https?://.*\.image',
'view_count': int,
},
'params': {
# m3u8 download
'skip_download': True,
}
}, {
'url': 'http://play.swissinfo.ch/play/tv/business/video/why-people-were-against-tax-reforms?id=42960270',
'info_dict': {
'id': '42960270',
'ext': 'mp4',
'title': 'Why people were against tax reforms',
'description': 'md5:7ac442c558e9630e947427469c4b824d',
'duration': 94.0,
'upload_date': '20170215',
'timestamp': 1487173560,
'thumbnail': r're:https?://www\.swissinfo\.ch/srgscalableimage/42961964',
'subtitles': 'count:9',
},
'params': {
'skip_download': True,
}
}, {
'url': 'https://www.srf.ch/play/tv/popupvideoplayer?id=c4dba0ca-e75b-43b2-a34f-f708a4932e01',
'only_matching': True,
@@ -181,6 +238,10 @@ class SRGSSRPlayIE(InfoExtractor):
}, {
'url': 'https://www.rts.ch/play/tv/19h30/video/le-19h30?urn=urn:rts:video:6348260',
'only_matching': True,
}, {
# audio segment, has podcastSdUrl of the full episode
'url': 'https://www.srf.ch/play/radio/popupaudioplayer?id=50b20dc8-f05b-4972-bf03-e438ff2833eb',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -188,5 +249,4 @@ class SRGSSRPlayIE(InfoExtractor):
bu = mobj.group('bu')
media_type = mobj.group('type') or mobj.group('type_2')
media_id = mobj.group('id')
# other info can be extracted from url + '&layout=json'
return self.url_result('srgssr:%s:%s:%s' % (bu[:3], media_type, media_id), 'SRGSSR')

View File

@@ -1,7 +1,6 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import int_or_none
class StretchInternetIE(InfoExtractor):
@@ -11,22 +10,28 @@ class StretchInternetIE(InfoExtractor):
'info_dict': {
'id': '573272',
'ext': 'mp4',
'title': 'University of Mary Wrestling vs. Upper Iowa',
'timestamp': 1575668361,
'upload_date': '20191206',
'title': 'UNIVERSITY OF MARY WRESTLING VS UPPER IOWA',
# 'timestamp': 1575668361,
# 'upload_date': '20191206',
'uploader_id': '99997',
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
media_url = self._download_json(
'https://core.stretchlive.com/trinity/event/tcg/' + video_id,
video_id)[0]['media'][0]['url']
event = self._download_json(
'https://api.stretchinternet.com/trinity/event/tcg/' + video_id,
video_id)[0]
'https://neo-client.stretchinternet.com/portal-ws/getEvent.json',
video_id, query={'eventID': video_id, 'token': 'asdf'})['event']
return {
'id': video_id,
'title': event['title'],
'timestamp': int_or_none(event.get('dateCreated'), 1000),
'url': 'https://' + event['media'][0]['url'],
# TODO: parse US timezone abbreviations
# 'timestamp': event.get('dateTimeString'),
'url': 'https://' + media_url,
'uploader_id': event.get('ownerID'),
}

View File

@@ -86,6 +86,7 @@ class TennisTVIE(InfoExtractor):
'https://www.tennistv.com/api/users/v1/entitlementchecknondiva',
video_id, note='Checking video authorization', headers=headers, data=check_json)
formats = self._extract_m3u8_formats(check_result['contentUrl'], video_id, ext='mp4')
self._sort_formats(formats)
vdata = self._download_json(
'https://www.tennistv.com/api/en/v2/none/common/video/%s' % video_id,

View File

@@ -14,6 +14,7 @@ from ..utils import (
class TrovoBaseIE(InfoExtractor):
_VALID_URL_BASE = r'https?://(?:www\.)?trovo\.live/'
_HEADERS = {'Origin': 'https://trovo.live'}
def _extract_streamer_info(self, data):
streamer_info = data.get('streamerInfo') or {}
@@ -68,6 +69,7 @@ class TrovoIE(TrovoBaseIE):
'format_id': format_id,
'height': int_or_none(format_id[:-1]) if format_id else None,
'url': play_url,
'http_headers': self._HEADERS,
})
self._sort_formats(formats)
@@ -153,6 +155,7 @@ class TrovoVodIE(TrovoBaseIE):
'protocol': 'm3u8_native',
'tbr': int_or_none(play_info.get('bitrate')),
'url': play_url,
'http_headers': self._HEADERS,
})
self._sort_formats(formats)

View File

@@ -21,6 +21,11 @@ class URPlayIE(InfoExtractor):
'description': 'md5:5344508a52aa78c1ced6c1b8b9e44e9a',
'timestamp': 1513292400,
'upload_date': '20171214',
'series': 'UR Samtiden - Livet, universum och rymdens märkliga musik',
'duration': 2269,
'categories': ['Kultur & historia'],
'tags': ['Kritiskt tänkande', 'Vetenskap', 'Vetenskaplig verksamhet'],
'episode': 'Om vetenskap, kritiskt tänkande och motstånd',
},
}, {
'url': 'https://urskola.se/Produkter/190031-Tripp-Trapp-Trad-Sovkudde',
@@ -31,6 +36,10 @@ class URPlayIE(InfoExtractor):
'description': 'md5:b86bffdae04a7e9379d1d7e5947df1d1',
'timestamp': 1440086400,
'upload_date': '20150820',
'series': 'Tripp, Trapp, Träd',
'duration': 865,
'tags': ['Sova'],
'episode': 'Sovkudde',
},
}, {
'url': 'http://urskola.se/Produkter/155794-Smasagor-meankieli-Grodan-i-vida-varlden',
@@ -41,9 +50,11 @@ class URPlayIE(InfoExtractor):
video_id = self._match_id(url)
url = url.replace('skola.se/Produkter', 'play.se/program')
webpage = self._download_webpage(url, video_id)
urplayer_data = self._parse_json(self._html_search_regex(
vid = int(video_id)
accessible_episodes = self._parse_json(self._html_search_regex(
r'data-react-class="routes/Product/components/ProgramContainer/ProgramContainer"[^>]+data-react-props="({.+?})"',
webpage, 'urplayer data'), video_id)['accessibleEpisodes'][0]
webpage, 'urplayer data'), video_id)['accessibleEpisodes']
urplayer_data = next(e for e in accessible_episodes if e.get('id') == vid)
episode = urplayer_data['title']
host = self._download_json('http://streaming-loadbalancer.ur.se/loadbalancer.json', video_id)['redirect']

View File

@@ -255,15 +255,8 @@ class VikiIE(VikiBaseIE):
def _real_extract(self, url):
video_id = self._match_id(url)
resp = self._download_json(
'https://www.viki.com/api/videos/' + video_id,
video_id, 'Downloading video JSON', headers={
'x-client-user-agent': std_headers['User-Agent'],
'x-viki-as-id': self._APP,
'x-viki-app-ver': self._APP_VERSION,
})
video = resp['video']
video = self._call_api(
'videos/%s.json' % video_id, video_id, 'Downloading video JSON')
self._check_errors(video)
title = self.dict_selection(video.get('titles', {}), 'en', allow_fallback=False)
@@ -286,24 +279,6 @@ class VikiIE(VikiBaseIE):
})
subtitles = {}
try:
# New way to fetch subtitles
new_video = self._download_json(
'https://www.viki.com/api/videos/%s' % video_id, video_id,
'Downloading new video JSON to get subtitles', fatal=False,
headers={
'x-client-user-agent': std_headers['User-Agent'],
'x-viki-as-id': self._APP,
'x-viki-app-ver': self._APP_VERSION,
})
for sub in new_video.get('streamSubtitles').get('dash'):
subtitles[sub.get('srclang')] = [{
'ext': 'vtt',
'url': sub.get('src'),
'completion': sub.get('percentage'),
}]
except AttributeError:
# fall-back to the old way if there isn't a streamSubtitles attribute
for subtitle_lang, _ in (video.get('subtitle_completions') or {}).items():
subtitles[subtitle_lang] = [{
'ext': subtitles_format,
@@ -386,9 +361,6 @@ class VikiIE(VikiBaseIE):
'filesize': int_or_none(urlh.headers.get('Content-Length')),
})
for format_id, format_dict in (resp.get('streams') or {}).items():
add_format(format_id, format_dict)
if not formats:
streams = self._call_api(
'videos/%s/streams.json' % video_id, video_id,
'Downloading video streams JSON')

View File

@@ -498,6 +498,24 @@ class VimeoIE(VimeoBaseInfoExtractor):
'url': 'https://vimeo.com/album/2632481/video/79010983',
'only_matching': True,
},
{
'url': 'https://vimeo.com/showcase/3253534/video/119195465',
'note': 'A video in a password protected album (showcase)',
'info_dict': {
'id': '119195465',
'ext': 'mp4',
'title': 'youtube-dl test video \'ä"BaW_jenozKc',
'uploader': 'Philipp Hagemeister',
'uploader_id': 'user20132939',
'description': 'md5:fa7b6c6d8db0bdc353893df2f111855b',
'upload_date': '20150209',
'timestamp': 1423518307,
},
'params': {
'format': 'best[protocol=https]',
'videopassword': 'youtube-dl',
},
},
{
# source file returns 403: Forbidden
'url': 'https://vimeo.com/7809605',
@@ -564,6 +582,44 @@ class VimeoIE(VimeoBaseInfoExtractor):
def _real_initialize(self):
self._login()
def _try_album_password(self, url):
album_id = self._search_regex(
r'vimeo\.com/(?:album|showcase)/([^/]+)', url, 'album id', default=None)
if not album_id:
return
viewer = self._download_json(
'https://vimeo.com/_rv/viewer', album_id, fatal=False)
if not viewer:
webpage = self._download_webpage(url, album_id)
viewer = self._parse_json(self._search_regex(
r'bootstrap_data\s*=\s*({.+?})</script>',
webpage, 'bootstrap data'), album_id)['viewer']
jwt = viewer['jwt']
album = self._download_json(
'https://api.vimeo.com/albums/' + album_id,
album_id, headers={'Authorization': 'jwt ' + jwt},
query={'fields': 'description,name,privacy'})
if try_get(album, lambda x: x['privacy']['view']) == 'password':
password = self._downloader.params.get('videopassword')
if not password:
raise ExtractorError(
'This album is protected by a password, use the --video-password option',
expected=True)
self._set_vimeo_cookie('vuid', viewer['vuid'])
try:
self._download_json(
'https://vimeo.com/showcase/%s/auth' % album_id,
album_id, 'Verifying the password', data=urlencode_postdata({
'password': password,
'token': viewer['xsrft'],
}), headers={
'X-Requested-With': 'XMLHttpRequest',
})
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
raise ExtractorError('Wrong password', expected=True)
raise
def _real_extract(self, url):
url, data = unsmuggle_url(url, {})
headers = std_headers.copy()
@@ -591,6 +647,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
elif any(p in url for p in ('play_redirect_hls', 'moogaloop.swf')):
url = 'https://vimeo.com/' + video_id
self._try_album_password(url)
try:
# Retrieve video webpage to extract further information
webpage, urlh = self._download_webpage_handle(

View File

@@ -75,12 +75,15 @@ class VVVVIDIE(InfoExtractor):
'https://www.vvvvid.it/user/login',
None, headers=self.geo_verification_headers())['data']['conn_id']
def _download_info(self, show_id, path, video_id, fatal=True):
def _download_info(self, show_id, path, video_id, fatal=True, query=None):
q = {
'conn_id': self._conn_id,
}
if query:
q.update(query)
response = self._download_json(
'https://www.vvvvid.it/vvvvid/ondemand/%s/%s' % (show_id, path),
video_id, headers=self.geo_verification_headers(), query={
'conn_id': self._conn_id,
}, fatal=fatal)
video_id, headers=self.geo_verification_headers(), query=q, fatal=fatal)
if not (response or fatal):
return
if response.get('result') == 'error':
@@ -98,7 +101,8 @@ class VVVVIDIE(InfoExtractor):
show_id, season_id, video_id = re.match(self._VALID_URL, url).groups()
response = self._download_info(
show_id, 'season/%s' % season_id, video_id)
show_id, 'season/%s' % season_id,
video_id, query={'video_id': video_id})
vid = int(video_id)
video_data = list(filter(
@@ -247,9 +251,13 @@ class VVVVIDShowIE(VVVVIDIE):
show_info = self._download_info(
show_id, 'info/', show_title, fatal=False)
if not show_title:
base_url += "/title"
entries = []
for season in (seasons or []):
episodes = season.get('episodes') or []
playlist_title = season.get('name') or show_info.get('title')
for episode in episodes:
if episode.get('playable') is False:
continue
@@ -259,12 +267,13 @@ class VVVVIDShowIE(VVVVIDIE):
continue
info = self._extract_common_video_info(episode)
info.update({
'_type': 'url',
'_type': 'url_transparent',
'ie_key': VVVVIDIE.ie_key(),
'url': '/'.join([base_url, season_id, video_id]),
'title': episode.get('title'),
'description': episode.get('description'),
'season_id': season_id,
'playlist_title': playlist_title,
})
entries.append(info)

View File

@@ -2,6 +2,7 @@
from __future__ import unicode_literals
import hashlib
import itertools
import json
import os.path
@@ -25,6 +26,7 @@ from ..compat import (
from ..jsinterp import JSInterpreter
from ..utils import (
clean_html,
dict_get,
ExtractorError,
format_field,
float_or_none,
@@ -58,9 +60,9 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
_TFA_URL = 'https://accounts.google.com/_/signin/challenge?hl=en&TL={0}'
_RESERVED_NAMES = (
r'embed|e|watch_popup|channel|c|user|playlist|watch|w|v|movies|results|shared|hashtag|'
r'storefront|oops|index|account|reporthistory|t/terms|about|upload|signin|logout|'
r'feed/(?:watch_later|history|subscriptions|library|trending|recommended)')
r'channel|c|user|playlist|watch|w|v|embed|e|watch_popup|'
r'movies|results|shared|hashtag|trending|feed|feeds|'
r'storefront|oops|index|account|reporthistory|t/terms|about|upload|signin|logout')
_NETRC_MACHINE = 'youtube'
# If True it will raise an error if no login info is provided
@@ -274,7 +276,7 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
'context': {
'client': {
'clientName': 'WEB',
'clientVersion': '2.20201021.03.00',
'clientVersion': '2.20210301.08.00',
}
},
}
@@ -283,15 +285,27 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
_YT_INITIAL_PLAYER_RESPONSE_RE = r'ytInitialPlayerResponse\s*=\s*({.+?})\s*;'
_YT_INITIAL_BOUNDARY_RE = r'(?:var\s+meta|</script|\n)'
def _call_api(self, ep, query, video_id, fatal=True):
def _generate_sapisidhash_header(self):
sapisid_cookie = self._get_cookies('https://www.youtube.com').get('SAPISID')
if sapisid_cookie is None:
return
time_now = round(time.time())
sapisidhash = hashlib.sha1((str(time_now) + " " + sapisid_cookie.value + " " + "https://www.youtube.com").encode("utf-8")).hexdigest()
return "SAPISIDHASH %s_%s" % (time_now, sapisidhash)
def _call_api(self, ep, query, video_id, fatal=True, headers=None,
note='Downloading API JSON', errnote='Unable to download API page'):
data = self._DEFAULT_API_DATA.copy()
data.update(query)
headers = headers or {}
headers.update({'content-type': 'application/json'})
auth = self._generate_sapisidhash_header()
if auth is not None:
headers.update({'Authorization': auth, 'X-Origin': 'https://www.youtube.com'})
return self._download_json(
'https://www.youtube.com/youtubei/v1/%s' % ep, video_id=video_id,
note='Downloading API JSON', errnote='Unable to download API page',
data=json.dumps(data).encode('utf8'), fatal=fatal,
headers={'content-type': 'application/json'},
'https://www.youtube.com/youtubei/v1/%s' % ep,
video_id=video_id, fatal=fatal, note=note, errnote=errnote,
data=json.dumps(data).encode('utf8'), headers=headers,
query={'key': 'AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8'})
def _extract_yt_initial_data(self, video_id, webpage):
@@ -1452,8 +1466,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
url, smuggled_data = unsmuggle_url(url, {})
video_id = self._match_id(url)
base_url = self.http_scheme() + '//www.youtube.com/'
webpage_url = base_url + 'watch?v=' + video_id + '&has_verified=1&bpctr=9999999999'
webpage = self._download_webpage(webpage_url, video_id, fatal=False)
webpage_url = base_url + 'watch?v=' + video_id
webpage = self._download_webpage(
webpage_url + '&has_verified=1&bpctr=9999999999',
video_id, fatal=False)
player_response = None
if webpage:
@@ -2010,9 +2026,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# Get comments
# TODO: Refactor and move to seperate function
if get_comments:
def extract_comments():
expected_video_comment_count = 0
video_comments = []
comment_xsrf = xsrf_token
def find_value(html, key, num_chars=2, separator='"'):
pos_begin = html.find(key) + len(key) + num_chars
@@ -2081,7 +2098,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
self.to_screen('Downloading comments')
while continuations:
continuation = continuations.pop()
comment_response = get_continuation(continuation, xsrf_token)
comment_response = get_continuation(continuation, comment_xsrf)
if not comment_response:
continue
if list(search_dict(comment_response, 'externalErrorMessage')):
@@ -2092,7 +2109,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
continue
# not sure if this actually helps
if 'xsrf_token' in comment_response:
xsrf_token = comment_response['xsrf_token']
comment_xsrf = comment_response['xsrf_token']
item_section = comment_response['response']['continuationContents']['itemSectionContinuation']
if first_continuation:
@@ -2121,7 +2138,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
while reply_continuations:
time.sleep(1)
continuation = reply_continuations.pop()
replies_data = get_continuation(continuation, xsrf_token, True)
replies_data = get_continuation(continuation, comment_xsrf, True)
if not replies_data or 'continuationContents' not in replies_data[1]['response']:
continue
@@ -2150,10 +2167,13 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
time.sleep(1)
self.to_screen('Total comments downloaded: %d of ~%d' % (len(video_comments), expected_video_comment_count))
info.update({
return {
'comments': video_comments,
'comment_count': expected_video_comment_count
})
}
if get_comments:
info['__post_extractor'] = extract_comments
self.mark_watched(video_id, player_response)
@@ -2500,17 +2520,22 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
channel_url, 'channel id')
@staticmethod
def _extract_grid_item_renderer(item):
for item_kind in ('Playlist', 'Video', 'Channel'):
renderer = item.get('grid%sRenderer' % item_kind)
if renderer:
def _extract_basic_item_renderer(item):
# Modified from _extract_grid_item_renderer
known_renderers = (
'playlistRenderer', 'videoRenderer', 'channelRenderer'
'gridPlaylistRenderer', 'gridVideoRenderer', 'gridChannelRenderer'
)
for key, renderer in item.items():
if key not in known_renderers:
continue
return renderer
def _grid_entries(self, grid_renderer):
for item in grid_renderer['items']:
if not isinstance(item, dict):
continue
renderer = self._extract_grid_item_renderer(item)
renderer = self._extract_basic_item_renderer(item)
if not isinstance(renderer, dict):
continue
title = try_get(
@@ -2539,7 +2564,7 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
content = shelf_renderer.get('content')
if not isinstance(content, dict):
return
renderer = content.get('gridRenderer')
renderer = content.get('gridRenderer') or content.get('expandedShelfContentsRenderer')
if renderer:
# TODO: add support for nested playlists so each shelf is processed
# as separate playlist
@@ -2581,20 +2606,6 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
continue
yield self._extract_video(renderer)
r""" # Not needed in the new implementation
def _itemSection_entries(self, item_sect_renderer):
for content in item_sect_renderer['contents']:
if not isinstance(content, dict):
continue
renderer = content.get('videoRenderer', {})
if not isinstance(renderer, dict):
continue
video_id = renderer.get('videoId')
if not video_id:
continue
yield self._extract_video(renderer)
"""
def _rich_entries(self, rich_grid_renderer):
renderer = try_get(
rich_grid_renderer, lambda x: x['content']['videoRenderer'], dict) or {}
@@ -2693,7 +2704,7 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
ctp = continuation_ep.get('clickTrackingParams')
return YoutubeTabIE._build_continuation_query(continuation, ctp)
def _entries(self, tab, identity_token):
def _entries(self, tab, item_id, identity_token, account_syncid):
def extract_entries(parent_renderer): # this needs to called again for continuation to work with feeds
contents = try_get(parent_renderer, lambda x: x['contents'], list) or []
@@ -2753,30 +2764,51 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
if identity_token:
headers['x-youtube-identity-token'] = identity_token
if account_syncid:
headers['X-Goog-PageId'] = account_syncid
headers['X-Goog-AuthUser'] = 0
for page_num in itertools.count(1):
if not continuation:
break
count = 0
retries = 3
while count <= retries:
try:
# Downloading page may result in intermittent 5xx HTTP error
# that is usually worked around with a retry
browse = self._download_json(
'https://www.youtube.com/browse_ajax', None,
'Downloading page %d%s'
% (page_num, ' (retry #%d)' % count if count else ''),
headers=headers, query=continuation)
break
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (500, 503):
retries = self._downloader.params.get('extractor_retries', 3)
count = -1
last_error = None
while count < retries:
count += 1
if count <= retries:
if last_error:
self.report_warning('%s. Retrying ...' % last_error)
try:
response = self._call_api(
ep="browse", fatal=True, headers=headers,
video_id='%s page %s' % (item_id, page_num),
query={
'continuation': continuation['continuation'],
'clickTracking': {'clickTrackingParams': continuation['itct']},
},
note='Downloading API JSON%s' % (' (retry #%d)' % count if count else ''))
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (500, 503, 404):
# Downloading page may result in intermittent 5xx HTTP error
# Sometimes a 404 is also recieved. See: https://github.com/ytdl-org/youtube-dl/issues/28289
last_error = 'HTTP Error %s' % e.cause.code
if count < retries:
continue
raise
if not browse:
else:
# Youtube sometimes sends incomplete data
# See: https://github.com/ytdl-org/youtube-dl/issues/28194
if dict_get(response,
('continuationContents', 'onResponseReceivedActions', 'onResponseReceivedEndpoints')):
break
response = try_get(browse, lambda x: x[1]['response'], dict)
# Youtube may send alerts if there was an issue with the continuation page
self._extract_alerts(response, expected=False)
last_error = 'Incomplete data received'
if count >= retries:
self._downloader.report_error(last_error)
if not response:
break
@@ -2805,11 +2837,13 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
'gridPlaylistRenderer': (self._grid_entries, 'items'),
'gridVideoRenderer': (self._grid_entries, 'items'),
'playlistVideoRenderer': (self._playlist_entries, 'contents'),
'itemSectionRenderer': (self._playlist_entries, 'contents'),
'itemSectionRenderer': (extract_entries, 'contents'), # for feeds
'richItemRenderer': (extract_entries, 'contents'), # for hashtag
'backstagePostThreadRenderer': (self._post_thread_continuation_entries, 'contents')
}
continuation_items = try_get(
response, lambda x: x['onResponseReceivedActions'][0]['appendContinuationItemsAction']['continuationItems'], list)
response,
lambda x: dict_get(x, ('onResponseReceivedActions', 'onResponseReceivedEndpoints'))[0]['appendContinuationItemsAction']['continuationItems'], list)
continuation_item = try_get(continuation_items, lambda x: x[0], dict) or {}
video_items_renderer = None
for key, value in continuation_item.items():
@@ -2856,7 +2890,7 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
try_get(owner, lambda x: x['navigationEndpoint']['browseEndpoint']['canonicalBaseUrl'], compat_str))
return {k: v for k, v in uploader.items() if v is not None}
def _extract_from_tabs(self, item_id, webpage, data, tabs, identity_token):
def _extract_from_tabs(self, item_id, webpage, data, tabs):
playlist_id = title = description = channel_url = channel_name = channel_id = None
thumbnails_list = tags = []
@@ -2920,16 +2954,41 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
'channel_id': metadata['uploader_id'],
'channel_url': metadata['uploader_url']})
return self.playlist_result(
self._entries(selected_tab, identity_token),
self._entries(
selected_tab, playlist_id,
self._extract_identity_token(webpage, item_id),
self._extract_account_syncid(data)),
**metadata)
def _extract_mix_playlist(self, playlist, playlist_id):
first_id = last_id = None
for page_num in itertools.count(1):
videos = list(self._playlist_entries(playlist))
if not videos:
return
start = next((i for i, v in enumerate(videos) if v['id'] == last_id), -1) + 1
if start >= len(videos):
return
for video in videos[start:]:
if video['id'] == first_id:
self.to_screen('First video %s found again; Assuming end of Mix' % first_id)
return
yield video
first_id = first_id or videos[0]['id']
last_id = videos[-1]['id']
_, data = self._extract_webpage(
'https://www.youtube.com/watch?list=%s&v=%s' % (playlist_id, last_id),
'%s page %d' % (playlist_id, page_num))
playlist = try_get(
data, lambda x: x['contents']['twoColumnWatchNextResults']['playlist']['playlist'], dict)
def _extract_from_playlist(self, item_id, url, data, playlist):
title = playlist.get('title') or try_get(
data, lambda x: x['titleText']['simpleText'], compat_str)
playlist_id = playlist.get('playlistId') or item_id
# Inline playlist rendition continuation does not always work
# at Youtube side, so delegating regular tab-based playlist URL
# processing whenever possible.
# Delegating everything except mix playlists to regular tab-based playlist URL
playlist_url = urljoin(url, try_get(
playlist, lambda x: x['endpoint']['commandMetadata']['webCommandMetadata']['url'],
compat_str))
@@ -2937,17 +2996,18 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
return self.url_result(
playlist_url, ie=YoutubeTabIE.ie_key(), video_id=playlist_id,
video_title=title)
return self.playlist_result(
self._playlist_entries(playlist), playlist_id=playlist_id,
playlist_title=title)
@staticmethod
def _extract_alerts(data):
return self.playlist_result(
self._extract_mix_playlist(playlist, playlist_id),
playlist_id=playlist_id, playlist_title=title)
def _extract_alerts(self, data, expected=False):
def _real_extract_alerts():
for alert_dict in try_get(data, lambda x: x['alerts'], list) or []:
if not isinstance(alert_dict, dict):
continue
for renderer in alert_dict:
alert = alert_dict[renderer]
for alert in alert_dict.values():
alert_type = alert.get('type')
if not alert_type:
continue
@@ -2959,6 +3019,18 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
if message:
yield alert_type, message
err_msg = None
for alert_type, alert_message in _real_extract_alerts():
if alert_type.lower() == 'error':
if err_msg:
self._downloader.report_warning('YouTube said: %s - %s' % ('ERROR', err_msg))
err_msg = alert_message
else:
self._downloader.report_warning('YouTube said: %s - %s' % (alert_type, alert_message))
if err_msg:
raise ExtractorError('YouTube said: %s' % err_msg, expected=expected)
def _extract_identity_token(self, webpage, item_id):
ytcfg = self._extract_ytcfg(item_id, webpage)
if ytcfg:
@@ -2969,64 +3041,90 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
r'\bID_TOKEN["\']\s*:\s*["\'](.+?)["\']', webpage,
'identity token', default=None)
@staticmethod
def _extract_account_syncid(data):
"""Extract syncId required to download private playlists of secondary channels"""
sync_ids = (
try_get(data, lambda x: x['responseContext']['mainAppWebResponseContext']['datasyncId'], compat_str)
or '').split("||")
if len(sync_ids) >= 2 and sync_ids[1]:
# datasyncid is of the form "channel_syncid||user_syncid" for secondary channel
# and just "user_syncid||" for primary channel. We only want the channel_syncid
return sync_ids[0]
def _extract_webpage(self, url, item_id):
retries = self._downloader.params.get('extractor_retries', 3)
count = -1
last_error = 'Incomplete yt initial data recieved'
while count < retries:
count += 1
# Sometimes youtube returns a webpage with incomplete ytInitialData
# See: https://github.com/yt-dlp/yt-dlp/issues/116
if count:
self.report_warning('%s. Retrying ...' % last_error)
webpage = self._download_webpage(
url, item_id,
'Downloading webpage%s' % (' (retry #%d)' % count if count else ''))
data = self._extract_yt_initial_data(item_id, webpage)
self._extract_alerts(data, expected=True)
if data.get('contents') or data.get('currentVideoEndpoint'):
break
if count >= retries:
self._downloader.report_error(last_error)
return webpage, data
def _real_extract(self, url):
item_id = self._match_id(url)
url = compat_urlparse.urlunparse(
compat_urlparse.urlparse(url)._replace(netloc='www.youtube.com'))
is_home = re.match(r'(?P<pre>%s)(?P<post>/?(?![^#?]).*$)' % self._VALID_URL, url)
if is_home is not None and is_home.group('not_channel') is None and item_id != 'feed':
# This is not matched in a channel page with a tab selected
mobj = re.match(r'(?P<pre>%s)(?P<post>/?(?![^#?]).*$)' % self._VALID_URL, url)
mobj = mobj.groupdict() if mobj else {}
if mobj and not mobj.get('not_channel'):
self._downloader.report_warning(
'A channel/user page was given. All the channel\'s videos will be downloaded. '
'To download only the videos in the home page, add a "/featured" to the URL')
url = '%s/videos%s' % (is_home.group('pre'), is_home.group('post') or '')
url = '%s/videos%s' % (mobj.get('pre'), mobj.get('post') or '')
# Handle both video/playlist URLs
qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
video_id = qs.get('v', [None])[0]
playlist_id = qs.get('list', [None])[0]
if is_home is not None and is_home.group('not_channel') is not None and is_home.group('not_channel').startswith('watch') and not video_id:
if playlist_id:
self._downloader.report_warning('%s is not a valid Youtube URL. Trying to download playlist %s' % (url, playlist_id))
url = 'https://www.youtube.com/playlist?list=%s' % playlist_id
# return self.url_result(playlist_id, ie=YoutubePlaylistIE.ie_key())
else:
if not video_id and (mobj.get('not_channel') or '').startswith('watch'):
if not playlist_id:
# If there is neither video or playlist ids,
# youtube redirects to home page, which is undesirable
raise ExtractorError('Unable to recognize tab page')
self._downloader.report_warning('A video URL was given without video ID. Trying to download playlist %s' % playlist_id)
url = 'https://www.youtube.com/playlist?list=%s' % playlist_id
if video_id and playlist_id:
if self._downloader.params.get('noplaylist'):
self.to_screen('Downloading just video %s because of --no-playlist' % video_id)
return self.url_result(video_id, ie=YoutubeIE.ie_key(), video_id=video_id)
self.to_screen('Downloading playlist %s - add --no-playlist to just download video %s' % (playlist_id, video_id))
self.to_screen('Downloading playlist %s; add --no-playlist to just download video %s' % (playlist_id, video_id))
webpage, data = self._extract_webpage(url, item_id)
webpage = self._download_webpage(url, item_id)
identity_token = self._extract_identity_token(webpage, item_id)
data = self._extract_yt_initial_data(item_id, webpage)
err_msg = None
for alert_type, alert_message in self._extract_alerts(data):
if alert_type.lower() == 'error':
if err_msg:
self._downloader.report_warning('YouTube said: %s - %s' % ('ERROR', err_msg))
err_msg = alert_message
else:
self._downloader.report_warning('YouTube said: %s - %s' % (alert_type, alert_message))
if err_msg:
raise ExtractorError('YouTube said: %s' % err_msg, expected=True)
tabs = try_get(
data, lambda x: x['contents']['twoColumnBrowseResultsRenderer']['tabs'], list)
if tabs:
return self._extract_from_tabs(item_id, webpage, data, tabs, identity_token)
return self._extract_from_tabs(item_id, webpage, data, tabs)
playlist = try_get(
data, lambda x: x['contents']['twoColumnWatchNextResults']['playlist']['playlist'], dict)
if playlist:
return self._extract_from_playlist(item_id, url, data, playlist)
# Fallback to video extraction if no playlist alike page is recognized.
# First check for the current video then try the v attribute of URL query.
video_id = try_get(
data, lambda x: x['currentVideoEndpoint']['watchEndpoint']['videoId'],
compat_str) or video_id
if video_id:
self._downloader.report_warning('Unable to recognize playlist. Downloading just video %s' % video_id)
return self.url_result(video_id, ie=YoutubeIE.ie_key(), video_id=video_id)
# Failed to recognize
raise ExtractorError('Unable to recognize tab page')
@@ -3191,26 +3289,14 @@ class YoutubeSearchIE(SearchInfoExtractor, YoutubeBaseInfoExtractor):
_TESTS = []
def _entries(self, query, n):
data = {
'context': {
'client': {
'clientName': 'WEB',
'clientVersion': '2.20201021.03.00',
}
},
'query': query,
}
data = {'query': query}
if self._SEARCH_PARAMS:
data['params'] = self._SEARCH_PARAMS
total = 0
for page_num in itertools.count(1):
search = self._download_json(
'https://www.youtube.com/youtubei/v1/search?key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8',
video_id='query "%s"' % query,
note='Downloading page %s' % page_num,
errnote='Unable to download API page', fatal=False,
data=json.dumps(data).encode('utf8'),
headers={'content-type': 'application/json'})
search = self._call_api(
ep='search', video_id='query "%s"' % query, fatal=False,
note='Downloading page %s' % page_num, query=data)
if not search:
break
slr_contents = try_get(
@@ -3302,7 +3388,6 @@ class YoutubeFeedsInfoExtractor(YoutubeTabIE):
Subclasses must define the _FEED_NAME property.
"""
_LOGIN_REQUIRED = True
# _MAX_PAGES = 5
_TESTS = []
@property
@@ -3362,8 +3447,8 @@ class YoutubeSubscriptionsIE(YoutubeFeedsInfoExtractor):
class YoutubeHistoryIE(YoutubeFeedsInfoExtractor):
IE_DESC = 'Youtube watch history, ":ythistory" for short (requires authentication)'
_VALID_URL = r':ythistory'
IE_DESC = 'Youtube watch history, ":ythis" for short (requires authentication)'
_VALID_URL = r':ythis(?:tory)?'
_FEED_NAME = 'history'
_TESTS = [{
'url': ':ythistory',

View File

@@ -7,7 +7,9 @@ from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
determine_ext,
float_or_none,
int_or_none,
merge_dicts,
NO_DEFAULT,
orderedSet,
parse_codecs,
@@ -21,61 +23,17 @@ from ..utils import (
class ZDFBaseIE(InfoExtractor):
def _call_api(self, url, player, referrer, video_id, item):
return self._download_json(
url, video_id, 'Downloading JSON %s' % item,
headers={
'Referer': referrer,
'Api-Auth': 'Bearer %s' % player['apiToken'],
})
def _extract_player(self, webpage, video_id, fatal=True):
return self._parse_json(
self._search_regex(
r'(?s)data-zdfplayer-jsb=(["\'])(?P<json>{.+?})\1', webpage,
'player JSON', default='{}' if not fatal else NO_DEFAULT,
group='json'),
video_id)
class ZDFIE(ZDFBaseIE):
IE_NAME = "ZDF-3sat"
_VALID_URL = r'https?://www\.(zdf|3sat)\.de/(?:[^/]+/)*(?P<id>[^/?]+)\.html'
_QUALITIES = ('auto', 'low', 'med', 'high', 'veryhigh', 'hd')
_GEO_COUNTRIES = ['DE']
_QUALITIES = ('auto', 'low', 'med', 'high', 'veryhigh', 'hd')
_TESTS = [{
'url': 'https://www.3sat.de/wissen/wissenschaftsdoku/luxusgut-lebensraum-100.html',
'info_dict': {
'id': 'luxusgut-lebensraum-100',
'ext': 'mp4',
'title': 'Luxusgut Lebensraum',
'description': 'md5:5c09b2f45ac3bc5233d1b50fc543d061',
'duration': 2601,
'timestamp': 1566497700,
'upload_date': '20190822',
}
}, {
'url': 'https://www.zdf.de/dokumentation/terra-x/die-magie-der-farben-von-koenigspurpur-und-jeansblau-100.html',
'info_dict': {
'id': 'die-magie-der-farben-von-koenigspurpur-und-jeansblau-100',
'ext': 'mp4',
'title': 'Die Magie der Farben (2/2)',
'description': 'md5:a89da10c928c6235401066b60a6d5c1a',
'duration': 2615,
'timestamp': 1465021200,
'upload_date': '20160604',
},
}, {
'url': 'https://www.zdf.de/service-und-hilfe/die-neue-zdf-mediathek/zdfmediathek-trailer-100.html',
'only_matching': True,
}, {
'url': 'https://www.zdf.de/filme/taunuskrimi/die-lebenden-und-die-toten-1---ein-taunuskrimi-100.html',
'only_matching': True,
}, {
'url': 'https://www.zdf.de/dokumentation/planet-e/planet-e-uebersichtsseite-weitere-dokumentationen-von-planet-e-100.html',
'only_matching': True,
}]
def _call_api(self, url, video_id, item, api_token=None, referrer=None):
headers = {}
if api_token:
headers['Api-Auth'] = 'Bearer %s' % api_token
if referrer:
headers['Referer'] = referrer
return self._download_json(
url, video_id, 'Downloading JSON %s' % item, headers=headers)
@staticmethod
def _extract_subtitles(src):
@@ -121,20 +79,11 @@ class ZDFIE(ZDFBaseIE):
})
formats.append(f)
def _extract_entry(self, url, player, content, video_id):
title = content.get('title') or content['teaserHeadline']
t = content['mainVideoContent']['http://zdf.de/rels/target']
ptmd_path = t.get('http://zdf.de/rels/streams/ptmd')
if not ptmd_path:
ptmd_path = t[
'http://zdf.de/rels/streams/ptmd-template'].replace(
'{playerId}', 'ngplayer_2_4')
def _extract_ptmd(self, ptmd_url, video_id, api_token, referrer):
ptmd = self._call_api(
urljoin(url, ptmd_path), player, url, video_id, 'metadata')
ptmd_url, video_id, 'metadata', api_token, referrer)
content_id = ptmd.get('basename') or ptmd_url.split('/')[-1]
formats = []
track_uris = set()
@@ -152,7 +101,7 @@ class ZDFIE(ZDFBaseIE):
continue
for track in tracks:
self._extract_format(
video_id, formats, track_uris, {
content_id, formats, track_uris, {
'url': track.get('uri'),
'type': f.get('type'),
'mimeType': f.get('mimeType'),
@@ -161,6 +110,103 @@ class ZDFIE(ZDFBaseIE):
})
self._sort_formats(formats)
duration = float_or_none(try_get(
ptmd, lambda x: x['attributes']['duration']['value']), scale=1000)
return {
'extractor_key': ZDFIE.ie_key(),
'id': content_id,
'duration': duration,
'formats': formats,
'subtitles': self._extract_subtitles(ptmd),
}
def _extract_player(self, webpage, video_id, fatal=True):
return self._parse_json(
self._search_regex(
r'(?s)data-zdfplayer-jsb=(["\'])(?P<json>{.+?})\1', webpage,
'player JSON', default='{}' if not fatal else NO_DEFAULT,
group='json'),
video_id)
class ZDFIE(ZDFBaseIE):
_VALID_URL = r'https?://www\.zdf\.de/(?:[^/]+/)*(?P<id>[^/?#&]+)\.html'
_TESTS = [{
# Same as https://www.phoenix.de/sendungen/ereignisse/corona-nachgehakt/wohin-fuehrt-der-protest-in-der-pandemie-a-2050630.html
'url': 'https://www.zdf.de/politik/phoenix-sendungen/wohin-fuehrt-der-protest-in-der-pandemie-100.html',
'md5': '34ec321e7eb34231fd88616c65c92db0',
'info_dict': {
'id': '210222_phx_nachgehakt_corona_protest',
'ext': 'mp4',
'title': 'Wohin führt der Protest in der Pandemie?',
'description': 'md5:7d643fe7f565e53a24aac036b2122fbd',
'duration': 1691,
'timestamp': 1613948400,
'upload_date': '20210221',
},
}, {
# Same as https://www.3sat.de/film/ab-18/10-wochen-sommer-108.html
'url': 'https://www.zdf.de/dokumentation/ab-18/10-wochen-sommer-102.html',
'md5': '0aff3e7bc72c8813f5e0fae333316a1d',
'info_dict': {
'id': '141007_ab18_10wochensommer_film',
'ext': 'mp4',
'title': 'Ab 18! - 10 Wochen Sommer',
'description': 'md5:8253f41dc99ce2c3ff892dac2d65fe26',
'duration': 2660,
'timestamp': 1608604200,
'upload_date': '20201222',
},
}, {
'url': 'https://www.zdf.de/dokumentation/terra-x/die-magie-der-farben-von-koenigspurpur-und-jeansblau-100.html',
'info_dict': {
'id': '151025_magie_farben2_tex',
'ext': 'mp4',
'title': 'Die Magie der Farben (2/2)',
'description': 'md5:a89da10c928c6235401066b60a6d5c1a',
'duration': 2615,
'timestamp': 1465021200,
'upload_date': '20160604',
},
}, {
# Same as https://www.phoenix.de/sendungen/dokumentationen/gesten-der-maechtigen-i-a-89468.html?ref=suche
'url': 'https://www.zdf.de/politik/phoenix-sendungen/die-gesten-der-maechtigen-100.html',
'only_matching': True,
}, {
# Same as https://www.3sat.de/film/spielfilm/der-hauptmann-100.html
'url': 'https://www.zdf.de/filme/filme-sonstige/der-hauptmann-112.html',
'only_matching': True,
}, {
# Same as https://www.3sat.de/wissen/nano/nano-21-mai-2019-102.html, equal media ids
'url': 'https://www.zdf.de/wissen/nano/nano-21-mai-2019-102.html',
'only_matching': True,
}, {
'url': 'https://www.zdf.de/service-und-hilfe/die-neue-zdf-mediathek/zdfmediathek-trailer-100.html',
'only_matching': True,
}, {
'url': 'https://www.zdf.de/filme/taunuskrimi/die-lebenden-und-die-toten-1---ein-taunuskrimi-100.html',
'only_matching': True,
}, {
'url': 'https://www.zdf.de/dokumentation/planet-e/planet-e-uebersichtsseite-weitere-dokumentationen-von-planet-e-100.html',
'only_matching': True,
}]
def _extract_entry(self, url, player, content, video_id):
title = content.get('title') or content['teaserHeadline']
t = content['mainVideoContent']['http://zdf.de/rels/target']
ptmd_path = t.get('http://zdf.de/rels/streams/ptmd')
if not ptmd_path:
ptmd_path = t[
'http://zdf.de/rels/streams/ptmd-template'].replace(
'{playerId}', 'ngplayer_2_4')
info = self._extract_ptmd(
urljoin(url, ptmd_path), video_id, player['apiToken'], url)
thumbnails = []
layouts = try_get(
content, lambda x: x['teaserImageRef']['layouts'], dict)
@@ -181,33 +227,33 @@ class ZDFIE(ZDFBaseIE):
})
thumbnails.append(thumbnail)
return {
'id': video_id,
return merge_dicts(info, {
'title': title,
'description': content.get('leadParagraph') or content.get('teasertext'),
'duration': int_or_none(t.get('duration')),
'timestamp': unified_timestamp(content.get('editorialDate')),
'thumbnails': thumbnails,
'subtitles': self._extract_subtitles(ptmd),
'formats': formats,
}
})
def _extract_regular(self, url, player, video_id):
content = self._call_api(
player['content'], player, url, video_id, 'content')
player['content'], video_id, 'content', player['apiToken'], url)
return self._extract_entry(player['content'], player, content, video_id)
def _extract_mobile(self, video_id):
document = self._download_json(
video = self._download_json(
'https://zdf-cdn.live.cellular.de/mediathekV2/document/%s' % video_id,
video_id)['document']
video_id)
document = video['document']
title = document['titel']
content_id = document['basename']
formats = []
format_urls = set()
for f in document['formitaeten']:
self._extract_format(video_id, formats, format_urls, f)
self._extract_format(content_id, formats, format_urls, f)
self._sort_formats(formats)
thumbnails = []
@@ -225,12 +271,12 @@ class ZDFIE(ZDFBaseIE):
})
return {
'id': video_id,
'id': content_id,
'title': title,
'description': document.get('beschreibung'),
'duration': int_or_none(document.get('length')),
'timestamp': unified_timestamp(try_get(
document, lambda x: x['meta']['editorialDate'], compat_str)),
'timestamp': unified_timestamp(document.get('date')) or unified_timestamp(
try_get(video, lambda x: x['meta']['editorialDate'], compat_str)),
'thumbnails': thumbnails,
'subtitles': self._extract_subtitles(document),
'formats': formats,

View File

@@ -347,7 +347,7 @@ def parseOpts(overrideArguments=None):
'Specify any key (see "OUTPUT TEMPLATE" for a list of available keys) to '
'match if the key is present, '
'!key to check if the key is not present, '
'key>NUMBER (like "comment_count > 12", also works with '
'key>NUMBER (like "view_count > 12", also works with '
'>=, <, <=, !=, =) to compare against a number, '
'key = \'LITERAL\' (like "uploader = \'Mike Smith\'", also works with !=) '
'to match against a string literal '
@@ -369,7 +369,7 @@ def parseOpts(overrideArguments=None):
help='Download only the video, if the URL refers to a video and a playlist')
selection.add_option(
'--yes-playlist',
action='store_false', dest='noplaylist', default=False,
action='store_false', dest='noplaylist',
help='Download the playlist, if the URL refers to a video and a playlist')
selection.add_option(
'--age-limit',
@@ -634,16 +634,24 @@ def parseOpts(overrideArguments=None):
help='Use ffmpeg instead of the native HLS downloader')
downloader.add_option(
'--hls-use-mpegts',
dest='hls_use_mpegts', action='store_true',
dest='hls_use_mpegts', action='store_true', default=None,
help=(
'Use the mpegts container for HLS videos, allowing to play the '
'video while downloading (some players may not be able to play it)'))
'Use the mpegts container for HLS videos; '
'allowing some players to play the video while downloading, '
'and reducing the chance of file corruption if download is interrupted. '
'This is enabled by default for live streams'))
downloader.add_option(
'--no-hls-use-mpegts',
dest='hls_use_mpegts', action='store_false',
help=(
'Do not use the mpegts container for HLS videos. '
'This is default when not downloading live streams'))
downloader.add_option(
'--external-downloader',
dest='external_downloader', metavar='NAME',
help=(
'Use the specified external downloader. '
'Currently supports %s' % ', '.join(list_external_downloaders())))
'Name or path of the external downloader to use. '
'Currently supports %s (Recommended: aria2c)' % ', '.join(list_external_downloaders())))
downloader.add_option(
'--downloader-args', '--external-downloader-args',
metavar='NAME:ARGS', dest='external_downloader_args', default={}, type='str',
@@ -688,6 +696,10 @@ def parseOpts(overrideArguments=None):
'--bidi-workaround',
dest='bidi_workaround', action='store_true',
help='Work around terminals that lack bidirectional text support. Requires bidiv or fribidi executable in PATH')
workarounds.add_option(
'--sleep-requests', metavar='SECONDS',
dest='sleep_interval_requests', type=float,
help='Number of seconds to sleep between requests during data extraction')
workarounds.add_option(
'--sleep-interval', '--min-sleep-interval', metavar='SECONDS',
dest='sleep_interval', type=float,
@@ -706,7 +718,7 @@ def parseOpts(overrideArguments=None):
workarounds.add_option(
'--sleep-subtitles', metavar='SECONDS',
dest='sleep_interval_subtitles', default=0, type=int,
help='Enforce sleep interval on subtitles as well')
help='Number of seconds to sleep before each subtitle download')
verbosity = optparse.OptionGroup(parser, 'Verbosity and Simulation Options')
verbosity.add_option(
@@ -973,7 +985,9 @@ def parseOpts(overrideArguments=None):
filesystem.add_option(
'--get-comments',
action='store_true', dest='getcomments', default=False,
help='Retrieve video comments to be placed in the .info.json file')
help=(
'Retrieve video comments to be placed in the .info.json file. '
'The comments are fetched even without this option if the extraction is known to be quick'))
filesystem.add_option(
'--load-info-json', '--load-info',
dest='load_info_filename', metavar='FILE',
@@ -1129,7 +1143,7 @@ def parseOpts(overrideArguments=None):
'Give field name to extract data from, and format of the field seperated by a ":". '
'Either regular expression with named capture groups or a '
'similar syntax to the output template can also be used. '
'The parsed parameters replace any existing values and can be use in output template'
'The parsed parameters replace any existing values and can be use in output template. '
'This option can be used multiple times. '
'Example: --parse-metadata "title:%(artist)s - %(title)s" matches a title like '
'"Coldplay - Paradise". '
@@ -1140,7 +1154,7 @@ def parseOpts(overrideArguments=None):
help='Write metadata to the video file\'s xattrs (using dublin core and xdg standards)')
postproc.add_option(
'--fixup',
metavar='POLICY', dest='fixup', default='detect_or_warn',
metavar='POLICY', dest='fixup', default=None,
help=(
'Automatically correct known faults of the file. '
'One of never (do nothing), warn (only emit a warning), '
@@ -1204,6 +1218,10 @@ def parseOpts(overrideArguments=None):
help=optparse.SUPPRESS_HELP)
extractor = optparse.OptionGroup(parser, 'Extractor Options')
extractor.add_option(
'--extractor-retries',
dest='extractor_retries', metavar='RETRIES', default=3,
help='Number of retries for known extractor errors (default is %default), or "infinite"')
extractor.add_option(
'--allow-dynamic-mpd', '--no-ignore-dynamic-mpd',
action='store_true', dest='dynamic_mpd', default=True,

View File

@@ -193,4 +193,6 @@ class EmbedThumbnailPP(FFmpegPostProcessor):
info['__thumbnail_filename'], os.path.splitext(original_thumbnail)[1][1:])
if original_thumbnail == thumbnail_filename:
files_to_delete = []
elif original_thumbnail != thumbnail_filename:
files_to_delete.append(original_thumbnail)
return files_to_delete, info

View File

@@ -49,12 +49,16 @@ def update_self(to_screen, verbose, opener):
h.update(mv[:n])
return h.hexdigest()
to_screen('Current Build Hash %s' % calc_sha256sum(sys.executable))
if not isinstance(globals().get('__loader__'), zipimporter) and not hasattr(sys, 'frozen'):
to_screen('It looks like you installed yt-dlp with a package manager, pip, setup.py or a tarball. Please use that to update.')
return
# sys.executable is set to the full pathname of the exe-file for py2exe
# though symlinks are not followed so that we need to do this manually
# with help of realpath
filename = compat_realpath(sys.executable if hasattr(sys, 'frozen') else sys.argv[0])
to_screen('Current Build Hash %s' % calc_sha256sum(filename))
# Download and check versions info
try:
version_info = opener.open(JSON_URL).read().decode('utf-8')
@@ -103,11 +107,6 @@ def update_self(to_screen, verbose, opener):
(i[1] for i in hashes if i[0] == 'yt-dlp%s' % label),
None)
# sys.executable is set to the full pathname of the exe-file for py2exe
# though symlinks are not followed so that we need to do this manually
# with help of realpath
filename = compat_realpath(sys.executable if hasattr(sys, 'frozen') else sys.argv[0])
if not os.access(filename, os.W_OK):
to_screen('ERROR: no write permissions on %s' % filename)
return
@@ -198,28 +197,18 @@ def update_self(to_screen, verbose, opener):
to_screen('Visit https://github.com/yt-dlp/yt-dlp/releases/latest')
return
expected_sum = get_sha256sum('zip', py_ver)
if expected_sum and hashlib.sha256(newcontent).hexdigest() != expected_sum:
to_screen('ERROR: unable to verify the new zip')
to_screen('Visit https://github.com/yt-dlp/yt-dlp/releases/latest')
return
try:
with open(filename + '.new', 'wb') as outf:
with open(filename, 'wb') as outf:
outf.write(newcontent)
except (IOError, OSError):
if verbose:
to_screen(encode_compat_str(traceback.format_exc()))
to_screen('ERROR: unable to write the new version')
return
expected_sum = get_sha256sum('zip', py_ver)
if expected_sum and calc_sha256sum(filename + '.new') != expected_sum:
to_screen('ERROR: unable to verify the new zip')
to_screen('Visit https://github.com/yt-dlp/yt-dlp/releases/latest')
try:
os.remove(filename + '.new')
except OSError:
to_screen('ERROR: unable to remove corrupt zip')
return
try:
os.rename(filename + '.new', filename)
except OSError:
to_screen('ERROR: unable to overwrite current version')
return

View File

@@ -5945,9 +5945,13 @@ def make_dir(path, to_screen=None):
def get_executable_path():
path = os.path.dirname(sys.argv[0])
if os.path.basename(sys.argv[0]) == '__main__': # Running from source
path = os.path.join(path, '..')
from zipimport import zipimporter
if hasattr(sys, 'frozen'): # Running from PyInstaller
path = os.path.dirname(sys.executable)
elif isinstance(globals().get('__loader__'), zipimporter): # Running from ZIP
path = os.path.join(os.path.dirname(__file__), '../..')
else:
path = os.path.join(os.path.dirname(__file__), '..')
return os.path.abspath(path)

View File

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2021.02.19'
__version__ = '2021.03.03.2'