Compare commits

..

162 Commits

Author SHA1 Message Date
github-actions[bot]
5945fc1945 Release 2024.09.27
Created by: bashonly

:ci skip all
2024-09-27 23:01:13 +00:00
bashonly
c6387abc1a [cleanup] Misc (#10807)
Closes #10751, Closes #10769, Closes #10791
Authored by: bashonly, Codenade, pzhlkj6612, seproDev, coletdjnz, grqz, Grub4K

Co-authored-by: Codenade <amadeus.dorian04@gmail.com>
Co-authored-by: Mozi <29089388+pzhlkj6612@users.noreply.github.com>
Co-authored-by: sepro <4618135+seproDev@users.noreply.github.com>
Co-authored-by: coletdjnz <coletdjnz@protonmail.com>
Co-authored-by: N/Ame <173015200+grqz@users.noreply.github.com>
Co-authored-by: Simon Sawicki <contact@grub4k.xyz>
2024-09-27 22:46:22 +00:00
bashonly
cca534cd9e Raise minimum recommended Python version to 3.9 (#11098)
Authored by: bashonly
2024-09-27 22:30:31 +00:00
kclauhk
7509d692b3 [ie/loom] Fix m3u8 formats extraction (#10760)
Closes #10737
Authored by: kclauhk
2024-09-27 22:28:22 +00:00
ndyanx
63da31b3b2 [ie/dropbox] Fix password-protected video support (#10735)
Also adds thumbnail extraction

Closes #9864
Authored by: ndyanx
2024-09-27 22:05:22 +00:00
rakslice
8f4ea14680 Fix format sorting bug with vp9.2 vcodec (#10884)
Authored by: rakslice
2024-09-27 21:32:39 +00:00
fireattack
a1b4ac2b8e [ie/vimeo] Fix HLS audio format sorting (#11082)
Closes #10854
Authored by: fireattack
2024-09-27 20:57:57 +00:00
Kieran
c08e0b20b5 Allow none arg to negate --convert-subs and --convert-thumbnails (#11066)
Authored by: kieraneglin
2024-09-27 20:52:41 +00:00
bashonly
0aa4426e9a [ie/kick:clips] Support new URL format (#11107)
Closes #11105
Authored by: bashonly
2024-09-27 16:38:40 +00:00
bashonly
48d629d461 [ie/YleAreena] Support podcasts (#11104)
Closes #10840
Authored by: bashonly
2024-09-27 16:38:08 +00:00
bashonly
7f909046f4 [ie/abc.net.au:iview:showseries] Fix extraction (#11101)
Closes #10475
Authored by: bashonly
2024-09-27 16:37:16 +00:00
bashonly
eabb4680fd [ie/niconico] Fix m3u8 formats extraction (#11103)
Closes #10724
Authored by: bashonly
2024-09-26 23:27:16 +00:00
bashonly
1d84b780cf [ie/youtube:clip] Prioritize https formats (#11102)
Closes #10856
Authored by: bashonly
2024-09-26 23:26:10 +00:00
bashonly
9f5c9a9089 [ie/wistia] Support password-protected videos (#11100)
Closes #10914
Authored by: bashonly
2024-09-26 23:21:03 +00:00
bashonly
a2000bc857 [ie/bilibili] Fix chapters and subtitles extraction (#11099)
Closes #11089
Authored by: bashonly
2024-09-26 23:20:14 +00:00
diman8
5a8a05aebb [ie/SVTPage] Fix extractor (#11010)
Authored by: diman8
2024-09-26 16:57:00 +00:00
tony-hn
ad0b857f45 [ie/RumbleChannel] Fix extractor (#11049)
Closes #10833
Authored by: tony-hn
2024-09-26 16:53:52 +00:00
N/Ame
124f058b54 [ie/Germanupa] Add extractor (#10538)
Closes #10527
Authored by: grqz
2024-09-26 18:39:48 +02:00
sepro
416686ed0c [ie/ertgr] Fix video extraction (#11091)
Closes #8955
Authored by: seproDev
2024-09-26 18:35:19 +02:00
sepro
b37417e4f9 [ie/SnapchatSpotlight] Add extractor (#11030)
Closes #1797
Authored by: seproDev
2024-09-26 18:32:51 +02:00
Mozi
28b0ecba2a [ie/Mojevideo] Add extractor (#11019)
Closes #8159
Authored by: 04-pasha-04, pzhlkj6612

Co-authored-by: pasha <pasha.syd04@gmail.com>
2024-09-26 18:29:21 +02:00
szantnerb
e2b3634e29 [ie/mediaklikk] Fix extractor (#11083)
Closes #11061
Authored by: szantnerb
2024-09-26 18:23:26 +02:00
bashonly
fb8b7f226d [build] Bump PyInstaller version pin to >=6.10.0 (#10709)
Authored by: bashonly
2024-09-25 23:07:17 +00:00
sepro
b397a64691 [cookies] Improve error message for Windows --cookies-from-browser chrome issue (#11090)
Authored by: seproDev
2024-09-25 23:13:54 +02:00
bashonly
5bb1aa04da [networking] Pin curl-cffi version to < 0.7.2 (#11092)
Ref: https://github.com/lexiforest/curl_cffi/issues/394

Authored by: bashonly
2024-09-25 20:59:20 +00:00
bashonly
fa2be9a7c6 [ie/youtube] Fix format_note (Bugfix for 3a3bd00037) (#11028)
Authored by: bashonly
2024-09-24 22:12:02 +00:00
bashonly
3ad0b7f422 [ie/tiktok] Fix web formats extraction (#11074)
Closes #11034
Authored by: bashonly
2024-09-24 22:10:42 +00:00
1-Byte
4a9bc8c363 [ie/NZZ] Fix extractor (#10461)
Closes #5653
Authored by: 1-Byte
2024-09-17 21:17:05 +02:00
Khaoklong51
a06bb58679 [ie/BiliIntl] Fix referer header (#11003)
Closes #10996
Authored by: Khaoklong51
2024-09-14 16:19:17 +00:00
bashonly
a555389c9b [ie/HGTVDe] Fix extractor (#10992)
Closes #10984
Authored by: bashonly, rdamas

Co-authored-by: Robert Damas <robert.damas@byom.de>
2024-09-14 00:23:22 +00:00
N/Ame
173d54c151 [ie/kick:vod] Support new URL format (#10988)
Closes #10975
Authored by: grqz, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2024-09-14 00:21:07 +00:00
Oto Valek
4a27b8f092 [ie/IPrima] Fix zoom URL support (#10959)
Closes #6100
Authored by: otovalek
2024-09-14 00:19:03 +00:00
sepro
41a241ca6f [ie/Sen] Add extractor (#10952)
Closes #10951
Authored by: seproDev
2024-09-14 00:16:34 +00:00
sepro
3aa0156e05 [ie/Xinpianchang] Fix extractor (#10950)
Authored by: seproDev
2024-09-14 00:15:07 +00:00
sepro
300c91274f [ie/Servus] Fix extractor (#10944)
Closes #10941
Authored by: seproDev
2024-09-14 00:14:09 +00:00
aarubui
d8d473002b [ie/tenplay] Fix extractor (#10928)
Closes #10926
Authored by: aarubui
2024-09-14 00:09:15 +00:00
naglis
36f9e602ad [ie/screenrec] Add extractor (#10917)
Closes #9780
Authored by: naglis
2024-09-13 23:27:10 +00:00
ischmidt20
7adff8caf1 [ie/WatchESPN] Improve auth support (#10910)
Authored by: ischmidt20
2024-09-13 23:25:12 +00:00
naglis
fa83d0b36b [ie/LnkGo] Remove extractor (#10904)
Authored by: naglis
2024-09-13 23:23:19 +00:00
Sahil Singh
c8c078fe28 [ie/pinterest] Extend _VALID_URL (#10867)
Closes #10850
Authored by: sahilsinghss73, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2024-09-13 23:22:14 +00:00
bashonly
325001317d [ie] Handle decode errors when reading responses (#10868)
Authored by: bashonly
2024-09-13 23:20:17 +00:00
bashonly
cc85596d5b [utils] mimetype2ext: Recognize aacp as aac (#10860)
Authored by: bashonly
2024-09-13 23:19:18 +00:00
Leng
0e1b941c6b [ie/facebook:reel] Improve metadata extraction
Closes #9057, Closes #10824
Authored by: lengzuo
2024-09-13 23:18:13 +00:00
Xingchen Song(宋星辰)
3dfd720d09 [ie/ximalaya] Add VIP support (#10832)
Closes #6928
Authored by: xingchensong, seproDev

Co-authored-by: sepro <4618135+seproDev@users.noreply.github.com>
2024-09-13 23:16:34 +00:00
hugepower
25c1cdaa26 [ie/huya:video] Add extractor (#10686)
Closes #10679
Authored by: hugepower
2024-09-13 23:12:38 +00:00
Cosmin Tanislav
d02df303d8 [ie/RTP] Support more subpages (#10787)
Authored by: Demon000
2024-09-13 23:09:52 +00:00
Scott Robinson
5d0176547f [ie/Bandcamp:user] Fix extraction (#10328)
Authored by: quad, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2024-09-13 23:02:54 +00:00
sepro
409f8e9e3b [ie] Fix JW Player format parsing (#10956)
Authored by: seproDev
2024-09-13 22:54:41 +00:00
Deukhoofd
b4760c778d [ie/beacon] Add extractor (#9901)
Authored by: Deukhoofd
2024-09-13 22:50:15 +00:00
sepro
9431777b4c [ie/youtube:tab] Fix shorts tab extraction (#10938)
Closes #10936
Authored by: seproDev
2024-09-13 22:46:44 +00:00
sepro
3a3bd00037 [ie/youtube] Add po_token, visitor_data, data_sync_id extractor args (#10648)
Authored by:  seproDev, coletdjnz, bashonly
2024-09-13 22:51:58 +12:00
coletdjnz
d1c4d88b2d [networking] Fix handler not being added to RequestError (#10955)
Authored by: coletdjnz
2024-09-08 19:32:44 +12:00
sepro
46f4c80bc3 [ie/SampleFocus] Fix extractor (#10947)
Closes #10945
Authored by: seproDev
2024-09-07 17:06:12 +02:00
sepro
0fba08485b [ie/khanacademy] Fix extractor (#10913)
Closes #10912
Authored by: seproDev
2024-09-05 20:47:14 +02:00
Simon Sawicki
b6200bdcf3 [ci] Add comment sanitization workflow (#10915)
Co-authored-by: bashonly <bashonly@protonmail.com>
Authored by: bashonly, Grub4K
2024-09-05 20:06:15 +02:00
sepro
e8e6a982a1 [ie/vimeo] Fix login detection (bugfix for 4115c24d15) (#10906)
Authored by: seproDev
2024-09-02 21:20:37 +02:00
bashonly
7e41628ff5 [build] Pin delocate version for macos (#10901)
Authored by: bashonly
2024-09-01 23:56:50 +00:00
Frank Aurich
e6f48ca808 [ie/KiKA] Add extractor (#5788)
Authored by: 1100101
2024-09-02 01:28:51 +02:00
bashonly
4115c24d15 [ie/vimeo] Always try to extract original format (#10721)
Closes #9163
Authored by: bashonly
2024-09-01 23:25:36 +00:00
bashonly
ad9a8115aa [ci] Add issue tracker anti-spam protection (#10861)
Authored by: bashonly
2024-08-28 08:01:51 +00:00
Mozi
41be32e78c [ie/Rutube] Support livestreams (#10844)
Closes #4418, Closes #4594
Authored by: pzhlkj6612
2024-08-26 23:17:25 +02:00
Mozi
e978c312d6 [ie/Vidflex] Add extractor (#10002)
Closes #1377
Authored by: pzhlkj6612
2024-08-26 22:56:36 +02:00
coletdjnz
6f9e653743 [rh:websockets] Upgrade websockets to 13.0 (#10815)
Fixes CI hanging

Authored by: coletdjnz
2024-08-21 19:17:26 +12:00
sepro
f0bb28504c [ie/Eurosport] Support local URL variants (#10785)
Authored by: seproDev
2024-08-20 00:12:42 +02:00
bashonly
bef1d4d6fc [ie/twitter:spaces] Support video spaces (#10789)
Authored by: bashonly
2024-08-19 15:38:19 +00:00
garret1317
c8d096c5ce [ie/radiko] Extract unique id values (#10726)
Authored by: garret1317
2024-08-19 15:22:19 +00:00
Mozi
a7d3235c84 [ie/asobistage] Support redirected URLs (#10768)
Authored by: pzhlkj6612
2024-08-18 16:50:06 +00:00
kclauhk
d62fef7e07 [ie/facebook:ads] Fix extractor (#10704)
Closes #10701
Authored by: kclauhk
2024-08-15 19:53:37 +00:00
Christopher Schreiner
cc88a54bb1 [ie/adn] Fix extractors (#10749)
Closes #10748
Authored by: infanf
2024-08-15 19:50:08 +00:00
N/Ame
b43bd86485 [ie/bilibili] Fix festival URL support (#10740)
Closes #10739
Authored by: grqz, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2024-08-15 19:33:41 +00:00
Hank Brown
232e6db30c [ie/PatreonCampaign] Support API URLs (#10734)
Closes #10733
Authored by: hibes, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2024-08-13 23:26:55 +00:00
bashonly
49f3741a82 [ie/youtube] Support excluding player_clients in extractor-arg (#10710)
Closes #10699
Authored by: bashonly
2024-08-12 09:12:46 +00:00
github-actions[bot]
a065086640 Release 2024.08.06
Created by: bashonly

:ci skip all :ci run dl
2024-08-06 03:03:12 +00:00
bashonly
4d92312083 [ie/niconico] Fix extractor (#10677)
Closes #10662
Authored by: bashonly
2024-08-06 02:50:06 +00:00
scribblemaniac
fc5eecfa31 [ie/gem.cbc.ca:live] Fix extractor (#10565)
Authored by: scribblemaniac, bashonly
2024-08-06 01:02:21 +00:00
bashonly
406f4c2e47 [ie/youtube] Change default player clients to ios,web_creator (#10674)
Closes #10660
Authored by: bashonly
2024-08-05 23:26:50 +00:00
sepro
c86891eb94 [ie/youtube] Fix n function name extraction for player b12cc44b (#10668)
Authored by: seproDev
2024-08-05 20:36:11 +00:00
sepro
bb8bf1db99 [jsinterp] Improve slice implementation (#10664)
Authored by: seproDev
2024-08-05 20:28:24 +00:00
bashonly
e7d73bc453 [ie/DiscoveryPlusItaly] Support sport and olympics URLs (#10655)
Closes #10654
Authored by: bashonly
2024-08-04 15:20:45 +00:00
bashonly
919540a964 [ie/olympics] Fix extraction (#10625)
Bugfix for 2f1ddfe12a

Closes #10592
Authored by: bashonly
2024-08-01 20:25:46 +00:00
hugepower
0088c6de23 [ie/youku] Fix extractor (#10626)
Closes #10549
Authored by: hugepower
2024-08-01 16:40:46 +00:00
github-actions[bot]
abe10131fc Release 2024.08.01
Created by: bashonly

:ci skip all :ci run dl
2024-08-01 15:11:19 +00:00
bashonly
ffd7781d65 [cleanup] Misc (#10623)
Authored by: bashonly
2024-08-01 15:03:49 +00:00
sepro
efb42763de [ie/youtube] Change default player clients to ios,tv (#10457)
Closes #10046
Authored by: seproDev
2024-08-01 14:03:03 +00:00
Oğulcan Tokar
bb3936ae2b [ie/kick:clips] Add extractor (#10572)
Closes #8115
Authored by: luvyana
2024-08-01 00:00:52 +00:00
bashonly
d19fcb9342 [ie/youtube] Fix age-verification workaround (#10610)
Authored by: bashonly, Grub4K

Co-authored-by: Simon Sawicki <contact@grub4k.xyz>
2024-07-31 21:39:36 +00:00
bashonly
011b4a04db [ie/youtube] Fix n function name extraction for player 20dfca59 (#10611)
Closes #10608
Authored by: bashonly
2024-07-31 21:19:30 +00:00
szantnerb
7e3e4779ad [ie/mediaklikk] Fix extractor (#10605)
Closes #10588
Authored by: szantnerb
2024-07-31 02:22:44 +00:00
vvto33
5260696b1c [ie/tver] Support olympic URLs (#10600)
Closes #10583
Authored by: vvto33
2024-07-31 02:18:43 +00:00
bashonly
2f1ddfe12a [ie/olympics] Fix extractor (#10604)
Closes #10592
Authored by: bashonly
2024-07-31 01:50:20 +00:00
bashonly
4b69e1b53e [ie/mlbtv] Fix makeup game extraction (#10607)
Closes #10606
Authored by: bashonly
2024-07-30 23:17:05 +00:00
bashonly
0e539617a4 [ie/youtube] Player client maintenance (#10573)
- Add clients: android_producer, android_testsuite, android_vr, tv, web_safari
- Remove obsolete clients: android_embedded, ios_embedded, *_embedscreen

Authored by: bashonly
2024-07-30 21:27:06 +00:00
bashonly
fe15d3178e [ie/learningonscreen] Add extractor (#10590)
Authored by: bashonly, Grub4K

Co-authored-by: Simon Sawicki <contact@grub4k.xyz>
2024-07-30 09:09:55 +00:00
trainman261
94a1c5e642 [ie/cbc.ca:player] Fix extractor (#10302)
Closes #10170
Authored by: trainman261, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2024-07-29 21:58:26 +00:00
bashonly
2b6df93a24 [ie/vimeo:review] Fix password-protected video extraction (#10598)
Closes #10255
Authored by: bashonly
2024-07-29 21:55:06 +00:00
middlingphys
ef36d517f9 [ie/abematv] Fix availability extraction (#10569)
Authored by: middlingphys
2024-07-29 05:54:59 +00:00
bashonly
6daf2c27c0 [utils] unified_timestamp: Recognize Sunday (#10589)
Authored by: bashonly
2024-07-29 05:35:46 +00:00
bashonly
28d485714f [ie/tva] Fix extractor (#10567)
Closes #10555
Authored by: bashonly
2024-07-25 22:30:00 +00:00
bashonly
0b77286184 [ie/DiscoveryPlus] Support olympics URLs (#10566)
Closes #10564
Authored by: bashonly
2024-07-25 22:00:58 +00:00
github-actions[bot]
6b1e430d8e Release 2024.07.25
Created by: bashonly

:ci skip all :ci run dl
2024-07-25 03:29:27 +00:00
bashonly
f0993391e6 [ie/mlbtv] Fix extractor (#10515)
Closes #10510
Authored by: bashonly
2024-07-24 21:22:55 +00:00
bashonly
1a34a802f4 [ie/facebook] Fix extraction (#10531)
Closes #10532
Authored by: bashonly
2024-07-23 23:08:24 +00:00
bashonly
a0a1bc3d8d [ie/vimeo] Fix chapters extraction (#10544)
Closes #5308
Authored by: bashonly
2024-07-23 22:00:57 +00:00
bashonly
2f97779f33 [ie/tiktok] Fix and deprioritize JSON subtitles (#10516)
Fixes regression caused by 5ce582448e

Closes #10514
Authored by: bashonly
2024-07-23 21:49:31 +00:00
bashonly
713b4cd18f [ie/youtube] Fix n function name extraction for player 3400486c (#10542)
Authored by: bashonly
2024-07-23 21:25:49 +00:00
bashonly
a3bab4752a [ie/abematv] Adapt key retrieval to request handler framework (#10491)
Fixes a regression caused by a dependence on buggy behavior that was corrected in 150ecc45d9

Closes #10489
Authored by: bashonly
2024-07-18 20:43:31 +00:00
bashonly
e046db8a11 [build] Pin setuptools version (#10493)
https://github.com/pypa/setuptools/issues/4480#issuecomment-2236507819
https://github.com/pypa/setuptools/issues/4482

Authored by: bashonly
2024-07-18 20:33:28 +00:00
github-actions[bot]
37c233562d Release 2024.07.16
Created by: bashonly

:ci skip all :ci run dl
2024-07-16 22:08:42 +00:00
bashonly
89a161e8c6 [cleanup] Misc (#10487)
Closes #10483
Authored by: bashonly
2024-07-16 22:01:01 +00:00
bashonly
ed1b9ed93d [update] Fix network error handling (#10486)
Authored by: bashonly
2024-07-16 21:10:50 +00:00
Simon Sawicki
d9cbced493 [core] Support auto-tty and no_color-tty for --color (#10453)
Authored by: Grub4K
2024-07-16 21:51:56 +02:00
Simon Sawicki
66ce3d76d8 [core] Fix noprogress if test=True with --quiet and --verbose (#10454)
Authored by: Grub4K
2024-07-16 21:48:45 +02:00
bashonly
39e6c4cb44 [ie/dplay] Fix extractors (#10471)
Closes #1623, Closes #2138, Closes #2361, Closes #3841, Closes #8026, Closes #10421
Authored by: bashonly
2024-07-15 22:30:43 +00:00
bashonly
e62fa6b0e0 [ie/digitalconcerthall] Extract HEVC and FLAC formats (#10470)
Authored by: bashonly
2024-07-14 22:56:28 -05:00
bashonly
cc0070f649 [utils] parse_codecs: Fix parsing of mixed case codec strings
Authored by: bashonly
2024-07-14 22:56:28 -05:00
sepro
b85eef0a61 [ie/youtube] Reduce android client priority (#10467)
Authored by: seproDev
2024-07-14 21:10:29 +02:00
DunnesH
22870b81ba [ie/soundcloud:user:permalink] Extract tracks only (#10463)
Closes #10242
Authored by: DunnesH
2024-07-14 19:01:50 +00:00
bashonly
b9afb99e7c [ie/generic] Fix direct video link extensions (#10468)
Fixes regression in the generic extractor due in 5ce582448e

Closes #10459
Authored by: bashonly
2024-07-14 18:57:07 +00:00
sepro
16da8ef993 [ie/youtube] Fix initial player response usage (Bugfix for 8b8b442cb0) (#10464)
Authored by: seproDev
2024-07-14 20:42:11 +02:00
Christopher Schreiner
959b7a379b [ie/adn] Adjust for .com domain change (#10399)
Closes #10442
Authored by: infanf
2024-07-14 15:58:05 +00:00
Ian Comaya
8531d2b03b [ie/EpidemicSound] Support sound effects URLs (#10436)
Closes #10435
Authored by: iancmy
2024-07-14 04:52:50 +00:00
bashonly
4cd4146924 [ie/afreecatv] Fix login and use legacy_ssl (#10440)
Fixes regression in e8352ad659 due to cookies bug in curl_cffi < 0.7.1

Closes #10438
Authored by: bashonly
2024-07-14 01:09:00 +00:00
Franklin Lee
bacd18b7df [ie/picarto] Fix extractors (#10414)
Closes #10413
Authored by: Frankgoji
2024-07-14 00:16:18 +00:00
coletdjnz
150ecc45d9 [networking] Add legacy_ssl request extension (#10448)
Supported by Urllib, Requests and Websockets request handlers. Ignored by CurlCFFI.

Also added couple cookie-related tests.

Authored by: coletdjnz
2024-07-14 11:22:43 +12:00
sepro
8b8b442cb0 [ie/youtube] Avoid poToken experiment player responses (#10456)
Closes #10397
Authored by: seproDev
2024-07-14 01:19:17 +02:00
bashonly
644d84d778 Revert 4f8448896e
curl-cffi 0.5.10 does not support Windows 32-bit

Authored by: bashonly
2024-07-12 14:34:19 -05:00
bashonly
ac30941ae6 [build] Pin curl-cffi to 0.5.10 for Windows
Ref: https://github.com/yifeikong/curl-impersonate/issues/72

Closes #10426
Authored by: bashonly
2024-07-12 14:34:19 -05:00
bashonly
cc1a3098c0 [ie/tv5monde] Fix impersonation (Bugfix for 9b95a6765a) (#10430)
Authored by: bashonly
2024-07-11 17:22:37 +00:00
sepro
705f5b84de [ie/box] Support enterprise URLs (#10419)
Closes #10418
Authored by: seproDev
2024-07-10 21:48:50 +02:00
bashonly
9b95a6765a [ie/tv5monde] Support browser impersonation (#10417)
Closes #10153
Authored by: bashonly
2024-07-10 15:13:47 +00:00
bashonly
4f8448896e [build] Include curl_cffi in yt-dlp_x86.exe
Authored by: bashonly
2024-07-09 18:36:15 -05:00
bashonly
4521f30d14 [build] Include curl_cffi in yt-dlp_linux
Authored by: bashonly
2024-07-09 18:36:15 -05:00
coletdjnz
42bfca00a6 [rh:curl_cffi] Support curl_cffi 0.7.X
Authored by: coletdjnz
2024-07-09 18:36:15 -05:00
mokrueger
d2189d3d36 [ie/tiktok:live] Fix room ID extraction (#10408)
Closes #10407
Authored by: mokrueger
2024-07-09 23:27:01 +00:00
bashonly
04e17ba20a [ie/youtube] Invalidate nsig cache from < 2024.07.09 (#10401)
Versions after 297b0a3792 and before 7ead7332af may have cached incorrect nsig function data

Authored by: bashonly
2024-07-09 19:04:46 +00:00
github-actions[bot]
bbf84bf55e Release 2024.07.09
Created by: seproDev

:ci skip all :ci run dl
2024-07-09 01:51:07 +00:00
sepro
7ead7332af [ie/youtube] Remove broken n function extraction fallback (#10396)
Closes #10391
Authored by: pukkandan, seproDev

Co-authored-by: pukkandan <pukkandan.ytdlp@gmail.com>
2024-07-09 03:45:14 +02:00
sepro
0b570f2a90 [core] Do not alter default format selection when simulated (#9862)
Closes #9843
Authored by: seproDev
2024-07-09 01:51:43 +02:00
github-actions[bot]
1a6ac547ea Release 2024.07.08
Created by: bashonly

:ci skip all :ci run dl
2024-07-08 22:19:18 +00:00
bashonly
4b50b292cc [ie/soundcloud] Fix rate-limit handling (#10389)
Authored by: bashonly
2024-07-08 22:09:08 +00:00
bashonly
297b0a3792 [ie/youtube] Fix JS n function name extraction (#10390)
Fixes nsig decoding for player b22ef6e7

Closes #10391
Authored by: bashonly, seproDev

Co-authored-by: sepro <4618135+seproDev@users.noreply.github.com>
2024-07-08 22:04:48 +00:00
Simon Sawicki
6c056ea7ae [jsinterp] Implement Function.prototype resolving for call and apply (#10392)
Authored by: Grub4K
2024-07-08 23:46:26 +02:00
github-actions[bot]
39bc699d2e Release 2024.07.07
Created by: bashonly

:ci skip all :ci run dl
2024-07-07 21:35:02 +00:00
bashonly
b337d2989c [cleanup] Misc (#10383)
Authored by: bashonly
2024-07-07 21:23:40 +00:00
Hardik Bhimani
f0f867f008 [ie/jiosaavn:playlist] Support featured playlists (#10382)
Closes #10369
Authored by: harbhim
2024-07-07 21:08:25 +00:00
DinhHuy2010
987a1f94c2 [ie/vtv] Add extractors (#10173)
Authored by: DinhHuy2010
2024-07-07 21:59:42 +02:00
sepro
4cdc976bd8 [ie/yle_areena] Fix metadata extraction (#10380)
Authored by: seproDev
2024-07-07 21:57:18 +02:00
Simon Sawicki
0d174e8bed [ie/yle_areena] Fix subtitle extraction (#10379)
Authored by: Grub4K
2024-07-07 21:21:00 +02:00
Dong Heon Hee
4862a29854 [ie/chzzk] Extract with API v3 (#10363)
Authored by: hui1601
2024-07-06 03:32:08 +00:00
bashonly
2469119490 [core] Address gaps in allowed extensions (#10362)
Adds some extensions missing in 5ce582448e

Closes #10360, Closes #10365
Authored by: bashonly
2024-07-05 23:17:47 +00:00
Sean Ellingham
00766ece0c [ie/vidyard] Add extractor (#10155)
Closes #4618
Authored by: exterrestris
2024-07-05 23:02:35 +00:00
middlingphys
2a1a1b8e67 [ie/abematv] Extract availability (#10348)
Authored by: middlingphys
2024-07-05 22:31:16 +00:00
bashonly
c1c9bb4adb [ie/vimeo] Fix password-protected video extraction (#10341)
Closes #6603
Authored by: bashonly
2024-07-05 18:32:53 +00:00
Thomas Gerbet
6075a029db [ie/douyutv] Do not use dangerous javascript source/URL (#10347)
Ref: https://sansec.io/research/polyfill-supply-chain-attack

Authored by: LeSuisse
2024-07-03 22:35:24 +00:00
bashonly
cc767e9490 [core] Fix --ignore-no-formats-error (#10345)
Fixes regression in 5ce582448e

Closes #10344
Authored by: Grub4K

Co-authored-by: Simon Sawicki <contact@grub4k.xyz>
2024-07-03 16:46:01 +00:00
github-actions[bot]
d28aa87e21 Release 2024.07.02
Created by: bashonly

:ci skip all :ci run dl
2024-07-02 23:13:48 +00:00
bashonly
93d33cb29a [cleanup] Misc (#10330)
Authored by: bashonly
2024-07-02 23:03:08 +00:00
Mozi
7799e51895 [ie/zaiko] Support JWT video URLs (#10130)
Closes #9798
Authored by: pzhlkj6612
2024-07-02 22:22:52 +00:00
Patryk Miś
7509791385 [ie/banbye] Fix extractor (#10332)
Closes #8584
Authored by: PatrykMis, seproDev

Co-authored-by: sepro <4618135+seproDev@users.noreply.github.com>
2024-07-02 21:51:07 +00:00
DrakoCpp
6403530e2d [ie/murrtube] Fix extractor (#9249)
Closes #7500
Authored by: DrakoCpp
2024-07-02 21:49:09 +00:00
bashonly
d502f4c6d9 [pp/embedthumbnail] Fix embedding with mutagen (#10337)
Fixes regression in f2a4ea1794

Closes #10335
Authored by: bashonly
2024-07-02 21:24:17 +00:00
bashonly
773bbb1815 [core] Fix --compat-opt allow-unsafe-ext (#10336)
Fixes bug in 5ce582448e

Authored by: bashonly, rdamas

Co-authored-by: Robert Damas <robert.damas@byom.de>
2024-07-02 21:17:06 +00:00
144 changed files with 5274 additions and 1917 deletions

View File

@@ -77,3 +77,11 @@ body:
render: shell
validations:
required: true
- type: markdown
attributes:
value: |
> [!CAUTION]
> ### GitHub is experiencing a high volume of malicious spam comments.
> ### If you receive any replies asking you download a file, do NOT follow the download links!
>
> Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

View File

@@ -89,3 +89,11 @@ body:
render: shell
validations:
required: true
- type: markdown
attributes:
value: |
> [!CAUTION]
> ### GitHub is experiencing a high volume of malicious spam comments.
> ### If you receive any replies asking you download a file, do NOT follow the download links!
>
> Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

View File

@@ -85,3 +85,11 @@ body:
render: shell
validations:
required: true
- type: markdown
attributes:
value: |
> [!CAUTION]
> ### GitHub is experiencing a high volume of malicious spam comments.
> ### If you receive any replies asking you download a file, do NOT follow the download links!
>
> Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

View File

@@ -70,3 +70,11 @@ body:
render: shell
validations:
required: true
- type: markdown
attributes:
value: |
> [!CAUTION]
> ### GitHub is experiencing a high volume of malicious spam comments.
> ### If you receive any replies asking you download a file, do NOT follow the download links!
>
> Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

View File

@@ -64,3 +64,11 @@ body:
[youtube] Extracting URL: https://www.youtube.com/watch?v=BaW_jenozKc
<more lines>
render: shell
- type: markdown
attributes:
value: |
> [!CAUTION]
> ### GitHub is experiencing a high volume of malicious spam comments.
> ### If you receive any replies asking you download a file, do NOT follow the download links!
>
> Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

View File

@@ -70,3 +70,11 @@ body:
[youtube] Extracting URL: https://www.youtube.com/watch?v=BaW_jenozKc
<more lines>
render: shell
- type: markdown
attributes:
value: |
> [!CAUTION]
> ### GitHub is experiencing a high volume of malicious spam comments.
> ### If you receive any replies asking you download a file, do NOT follow the download links!
>
> Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

View File

@@ -266,7 +266,7 @@ jobs:
# We need to ignore wheels otherwise we break universal2 builds
python3 -m pip install -U --no-binary :all: -r requirements.txt
# We need to fuse our own universal2 wheels for curl_cffi
python3 -m pip install -U delocate
python3 -m pip install -U 'delocate==0.11.0'
mkdir curl_cffi_whls curl_cffi_universal2
python3 devscripts/install_deps.py --print -o --include curl-cffi > requirements.txt
for platform in "macosx_11_0_arm64" "macosx_11_0_x86_64"; do
@@ -409,7 +409,7 @@ jobs:
run: | # Custom pyinstaller built with https://github.com/yt-dlp/pyinstaller-builds
python devscripts/install_deps.py -o --include build
python devscripts/install_deps.py --include curl-cffi
python -m pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-6.7.0-py3-none-any.whl"
python -m pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-6.10.0-py3-none-any.whl"
- name: Prepare
run: |
@@ -469,7 +469,7 @@ jobs:
run: |
python devscripts/install_deps.py -o --include build
python devscripts/install_deps.py
python -m pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-6.7.0-py3-none-any.whl"
python -m pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-6.10.0-py3-none-any.whl"
- name: Prepare
run: |

View File

@@ -55,6 +55,7 @@ jobs:
- name: Install test requirements
run: python3 ./devscripts/install_deps.py --include test --include curl-cffi
- name: Run tests
timeout-minutes: 15
continue-on-error: False
run: |
python3 -m yt_dlp -v || true # Print debug head

21
.github/workflows/issue-lockdown.yml vendored Normal file
View File

@@ -0,0 +1,21 @@
name: Issue Lockdown
on:
issues:
types: [opened]
permissions:
issues: write
jobs:
lockdown:
name: Issue Lockdown
if: vars.ISSUE_LOCKDOWN
runs-on: ubuntu-latest
steps:
- name: "Lock new issue"
env:
GH_TOKEN: ${{ github.token }}
ISSUE_NUMBER: ${{ github.event.issue.number }}
REPOSITORY: ${{ github.repository }}
run: |
gh issue lock "${ISSUE_NUMBER}" -R "${REPOSITORY}"

View File

@@ -15,8 +15,9 @@ jobs:
with:
python-version: '3.8'
- name: Install test requirements
run: python3 ./devscripts/install_deps.py --include test
run: python3 ./devscripts/install_deps.py -o --include test
- name: Run tests
timeout-minutes: 15
run: |
python3 -m yt_dlp -v || true
python3 ./devscripts/run_tests.py core

View File

@@ -204,7 +204,7 @@ jobs:
git config --global user.email "41898282+github-actions[bot]@users.noreply.github.com"
git add -u
git commit -m "Release ${{ env.version }}" \
-m "Created by: ${{ github.event.sender.login }}" -m ":ci skip all :ci run dl"
-m "Created by: ${{ github.event.sender.login }}" -m ":ci skip all"
git push origin --force ${{ github.event.ref }}:release
- name: Get target commitish
@@ -325,7 +325,7 @@ jobs:
"(https://github.com/yt-dlp/yt-dlp-master-builds/releases/latest \"Master builds\")"' || '' }} > ./RELEASE_NOTES
printf '\n\n' >> ./RELEASE_NOTES
cat >> ./RELEASE_NOTES << EOF
#### A description of the various files are in the [README](https://github.com/${{ github.repository }}#release-files)
#### A description of the various files is in the [README](https://github.com/${{ github.repository }}#release-files)
---
$(python ./devscripts/make_changelog.py -vv --collapsible)
EOF

17
.github/workflows/sanitize-comment.yml vendored Normal file
View File

@@ -0,0 +1,17 @@
name: Sanitize comment
on:
issue_comment:
types: [created, edited]
permissions:
issues: write
jobs:
sanitize-comment:
name: Sanitize comment
if: vars.SANITIZE_COMMENT && !github.event.issue.pull_request
runs-on: ubuntu-latest
steps:
- name: Sanitize comment
uses: yt-dlp/sanitize-comment@v1

2
.gitignore vendored
View File

@@ -51,7 +51,6 @@ cookies
*.srt
*.ssa
*.swf
*.swp
*.tt
*.ttml
*.url
@@ -119,6 +118,7 @@ yt-dlp.zip
.vscode
*.sublime-*
*.code-workspace
*.swp
# Lazy extractors
*/extractor/lazy_extractors.py

View File

@@ -644,3 +644,32 @@ peisenwang
TheZ3ro
tippfehlr
varunchopra
DrakoCpp
PatrykMis
DinhHuy2010
exterrestris
harbhim
LeSuisse
DunnesH
iancmy
mokrueger
luvyana
szantnerb
hugepower
scribblemaniac
Codenade
Demon000
Deukhoofd
grqz
hibes
Khaoklong51
kieraneglin
lengzuo
naglis
ndyanx
otovalek
quad
rakslice
sahilsinghss73
tony-hn
xingchensong

View File

@@ -4,10 +4,247 @@
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
-->
### 2024.09.27
#### Important changes
- **The minimum *recommended* Python version has been raised to 3.9**
Since Python 3.8 will reach end-of-life in October 2024, support for it will be dropped soon. [Read more](https://github.com/yt-dlp/yt-dlp/issues/10086)
#### Core changes
- [Allow `none` arg to negate `--convert-subs` and `--convert-thumbnails`](https://github.com/yt-dlp/yt-dlp/commit/c08e0b20b5edd8957b8318716bc14e896d1b96f4) ([#11066](https://github.com/yt-dlp/yt-dlp/issues/11066)) by [kieraneglin](https://github.com/kieraneglin)
- [Fix format sorting bug with vp9.2 vcodec](https://github.com/yt-dlp/yt-dlp/commit/8f4ea14680c7865d8ffac10a9174205d1d84ada7) ([#10884](https://github.com/yt-dlp/yt-dlp/issues/10884)) by [rakslice](https://github.com/rakslice)
- [Raise minimum recommended Python version to 3.9](https://github.com/yt-dlp/yt-dlp/commit/cca534cd9e6850c70244f225a4a1895ef4bcdbec) ([#11098](https://github.com/yt-dlp/yt-dlp/issues/11098)) by [bashonly](https://github.com/bashonly)
- **cookies**: [Improve error message for Windows `--cookies-from-browser chrome` issue](https://github.com/yt-dlp/yt-dlp/commit/b397a64691421ace5df09457c2a764821a2dc6f2) ([#11090](https://github.com/yt-dlp/yt-dlp/issues/11090)) by [seproDev](https://github.com/seproDev)
- **utils**: `mimetype2ext`: [Recognize `aacp` as `aac`](https://github.com/yt-dlp/yt-dlp/commit/cc85596d5b59f0c14e9381b3675f619c1e12e597) ([#10860](https://github.com/yt-dlp/yt-dlp/issues/10860)) by [bashonly](https://github.com/bashonly)
#### Extractor changes
- [Fix JW Player format parsing](https://github.com/yt-dlp/yt-dlp/commit/409f8e9e3b4bde81ef76fc563256f876d2ff8099) ([#10956](https://github.com/yt-dlp/yt-dlp/issues/10956)) by [seproDev](https://github.com/seproDev)
- [Handle decode errors when reading responses](https://github.com/yt-dlp/yt-dlp/commit/325001317d97f4545d66fac44c4ba772c6f45f22) ([#10868](https://github.com/yt-dlp/yt-dlp/issues/10868)) by [bashonly](https://github.com/bashonly)
- **abc.net.au**: iview, showseries: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/7f909046f4dc0fba472b4963145aef6e0d42491b) ([#11101](https://github.com/yt-dlp/yt-dlp/issues/11101)) by [bashonly](https://github.com/bashonly)
- **adn**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/cc88a54bb1ef285154775f8a6a413335ce4c71ce) ([#10749](https://github.com/yt-dlp/yt-dlp/issues/10749)) by [infanf](https://github.com/infanf)
- **asobistage**: [Support redirected URLs](https://github.com/yt-dlp/yt-dlp/commit/a7d3235c84dac57a127cbe0ff38f7f7c2fdd8fa0) ([#10768](https://github.com/yt-dlp/yt-dlp/issues/10768)) by [pzhlkj6612](https://github.com/pzhlkj6612)
- **bandcamp**: user: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/5d0176547f16a3642cd71627126e9dfc24981e20) ([#10328](https://github.com/yt-dlp/yt-dlp/issues/10328)) by [bashonly](https://github.com/bashonly), [quad](https://github.com/quad)
- **beacon**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/b4760c778d0c92c6e3f2bc8346cd72c8f08595ae) ([#9901](https://github.com/yt-dlp/yt-dlp/issues/9901)) by [Deukhoofd](https://github.com/Deukhoofd)
- **bilibili**
- [Fix chapters and subtitles extraction](https://github.com/yt-dlp/yt-dlp/commit/a2000bc85730c950351d78bb818493dc39dca3cb) ([#11099](https://github.com/yt-dlp/yt-dlp/issues/11099)) by [bashonly](https://github.com/bashonly)
- [Fix festival URL support](https://github.com/yt-dlp/yt-dlp/commit/b43bd864851f2862e26caa85461c5d825d49d463) ([#10740](https://github.com/yt-dlp/yt-dlp/issues/10740)) by [bashonly](https://github.com/bashonly), [grqz](https://github.com/grqz)
- **biliintl**: [Fix referer header](https://github.com/yt-dlp/yt-dlp/commit/a06bb586795ebab87a2356923acfc674d6f0e152) ([#11003](https://github.com/yt-dlp/yt-dlp/issues/11003)) by [Khaoklong51](https://github.com/Khaoklong51)
- **dropbox**: [Fix password-protected video support](https://github.com/yt-dlp/yt-dlp/commit/63da31b3b29af90062d8a72a905ffe4b5e499042) ([#10735](https://github.com/yt-dlp/yt-dlp/issues/10735)) by [ndyanx](https://github.com/ndyanx)
- **ertgr**: [Fix video extraction](https://github.com/yt-dlp/yt-dlp/commit/416686ed0cf792ec44ab059f3b229dd776077e14) ([#11091](https://github.com/yt-dlp/yt-dlp/issues/11091)) by [seproDev](https://github.com/seproDev)
- **eurosport**: [Support local URL variants](https://github.com/yt-dlp/yt-dlp/commit/f0bb28504c8c2b75ee3e5796aed50de2a7f90a1b) ([#10785](https://github.com/yt-dlp/yt-dlp/issues/10785)) by [seproDev](https://github.com/seproDev)
- **facebook**
- ads: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/d62fef7e07d454c0d2ba2d69fb96d691dba1ded0) ([#10704](https://github.com/yt-dlp/yt-dlp/issues/10704)) by [kclauhk](https://github.com/kclauhk)
- reel: [Improve metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/0e1b941c6b2caa688b0d3332e723d16dbafa4311) by [lengzuo](https://github.com/lengzuo)
- **germanupa**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/124f058b546d652a359c67025bb479789bfbef0b) ([#10538](https://github.com/yt-dlp/yt-dlp/issues/10538)) by [grqz](https://github.com/grqz)
- **hgtvde**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/a555389c9bb32e589e00b4664974423fb7b04dcd) ([#10992](https://github.com/yt-dlp/yt-dlp/issues/10992)) by [bashonly](https://github.com/bashonly), [rdamas](https://github.com/rdamas)
- **huya**: video: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/25c1cdaa2650563494d3bf00a38f72d0d9486bff) ([#10686](https://github.com/yt-dlp/yt-dlp/issues/10686)) by [hugepower](https://github.com/hugepower)
- **iprima**: [Fix zoom URL support](https://github.com/yt-dlp/yt-dlp/commit/4a27b8f092f7f7c10b7a334d3535c97c2af02f0a) ([#10959](https://github.com/yt-dlp/yt-dlp/issues/10959)) by [otovalek](https://github.com/otovalek)
- **khanacademy**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/0fba08485b6445b72b5b63ae23ca2a73fa5d967f) ([#10913](https://github.com/yt-dlp/yt-dlp/issues/10913)) by [seproDev](https://github.com/seproDev)
- **kick**
- clips: [Support new URL format](https://github.com/yt-dlp/yt-dlp/commit/0aa4426e9a35f7f8e184f1f2082b3b313c1448f7) ([#11107](https://github.com/yt-dlp/yt-dlp/issues/11107)) by [bashonly](https://github.com/bashonly)
- vod: [Support new URL format](https://github.com/yt-dlp/yt-dlp/commit/173d54c151b987409e3eb09552d8d89ed8fc50f7) ([#10988](https://github.com/yt-dlp/yt-dlp/issues/10988)) by [bashonly](https://github.com/bashonly), [grqz](https://github.com/grqz)
- **kika**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/e6f48ca80821939c1fd11ec2a0cdbf2fba9b258a) ([#5788](https://github.com/yt-dlp/yt-dlp/issues/5788)) by [1100101](https://github.com/1100101)
- **lnkgo**: [Remove extractor](https://github.com/yt-dlp/yt-dlp/commit/fa83d0b36bc43d30fe9241c1e923f4614864b758) ([#10904](https://github.com/yt-dlp/yt-dlp/issues/10904)) by [naglis](https://github.com/naglis)
- **loom**: [Fix m3u8 formats extraction](https://github.com/yt-dlp/yt-dlp/commit/7509d692b37a7ec6230ea75bfe1e44a8de5eefce) ([#10760](https://github.com/yt-dlp/yt-dlp/issues/10760)) by [kclauhk](https://github.com/kclauhk)
- **mediaklikk**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/e2b3634e299be9c16a247ece3b1858d83889c324) ([#11083](https://github.com/yt-dlp/yt-dlp/issues/11083)) by [szantnerb](https://github.com/szantnerb)
- **mojevideo**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/28b0ecba2af5b4919f198474b3d00a76ef322c31) ([#11019](https://github.com/yt-dlp/yt-dlp/issues/11019)) by [04-pasha-04](https://github.com/04-pasha-04), [pzhlkj6612](https://github.com/pzhlkj6612)
- **niconico**: [Fix m3u8 formats extraction](https://github.com/yt-dlp/yt-dlp/commit/eabb4680fdb09ba1f48d174a700a2e3b43f82add) ([#11103](https://github.com/yt-dlp/yt-dlp/issues/11103)) by [bashonly](https://github.com/bashonly)
- **nzz**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/4a9bc8c3630378bc29f0266126b503f6190c0430) ([#10461](https://github.com/yt-dlp/yt-dlp/issues/10461)) by [1-Byte](https://github.com/1-Byte)
- **patreoncampaign**: [Support API URLs](https://github.com/yt-dlp/yt-dlp/commit/232e6db30c474d1b387e405342f34173ceeaf832) ([#10734](https://github.com/yt-dlp/yt-dlp/issues/10734)) by [bashonly](https://github.com/bashonly), [hibes](https://github.com/hibes)
- **pinterest**: [Extend `_VALID_URL`](https://github.com/yt-dlp/yt-dlp/commit/c8c078fe28b0ffc15ef9646346c00c592fe71a78) ([#10867](https://github.com/yt-dlp/yt-dlp/issues/10867)) by [bashonly](https://github.com/bashonly), [sahilsinghss73](https://github.com/sahilsinghss73)
- **radiko**: [Extract unique `id` values](https://github.com/yt-dlp/yt-dlp/commit/c8d096c5ce111411fbdbe2abb8fed54f317a6182) ([#10726](https://github.com/yt-dlp/yt-dlp/issues/10726)) by [garret1317](https://github.com/garret1317)
- **rtp**: [Support more subpages](https://github.com/yt-dlp/yt-dlp/commit/d02df303d8e49390599db9f34482697e4d1cf5b2) ([#10787](https://github.com/yt-dlp/yt-dlp/issues/10787)) by [Demon000](https://github.com/Demon000)
- **rumblechannel**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/ad0b857f459a6d390fbf124183916218c52f223a) ([#11049](https://github.com/yt-dlp/yt-dlp/issues/11049)) by [tony-hn](https://github.com/tony-hn)
- **rutube**: [Support livestreams](https://github.com/yt-dlp/yt-dlp/commit/41be32e78c3845000dbac188ffb90ea3ea7c4dfa) ([#10844](https://github.com/yt-dlp/yt-dlp/issues/10844)) by [pzhlkj6612](https://github.com/pzhlkj6612)
- **samplefocus**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/46f4c80bc363ee8116c33d37f65202e6c3470954) ([#10947](https://github.com/yt-dlp/yt-dlp/issues/10947)) by [seproDev](https://github.com/seproDev)
- **screenrec**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/36f9e602ad55679764bc75a4f67f7562b1d6adcf) ([#10917](https://github.com/yt-dlp/yt-dlp/issues/10917)) by [naglis](https://github.com/naglis)
- **sen**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/41a241ca6ffb95b3d9aaf4f42106ca8cba9af1a6) ([#10952](https://github.com/yt-dlp/yt-dlp/issues/10952)) by [seproDev](https://github.com/seproDev)
- **servus**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/300c91274f7ea5b1b0528fc5ee11cf1a61d4079e) ([#10944](https://github.com/yt-dlp/yt-dlp/issues/10944)) by [seproDev](https://github.com/seproDev)
- **snapchatspotlight**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/b37417e4f934fd8909788b493d017777155b0ae5) ([#11030](https://github.com/yt-dlp/yt-dlp/issues/11030)) by [seproDev](https://github.com/seproDev)
- **svtpage**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/5a8a05aebb49693e78e1123015837ed5e961ff76) ([#11010](https://github.com/yt-dlp/yt-dlp/issues/11010)) by [diman8](https://github.com/diman8)
- **tenplay**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/d8d473002b654ab0e7b97ead869f58b4361eeae1) ([#10928](https://github.com/yt-dlp/yt-dlp/issues/10928)) by [aarubui](https://github.com/aarubui)
- **tiktok**: [Fix web formats extraction](https://github.com/yt-dlp/yt-dlp/commit/3ad0b7f422d547204df687b6d0b2d9110fff3990) ([#11074](https://github.com/yt-dlp/yt-dlp/issues/11074)) by [bashonly](https://github.com/bashonly)
- **twitter**: spaces: [Support video spaces](https://github.com/yt-dlp/yt-dlp/commit/bef1d4d6fc9493fda7f75e2289c07c507d10092f) ([#10789](https://github.com/yt-dlp/yt-dlp/issues/10789)) by [bashonly](https://github.com/bashonly)
- **vidflex**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/e978c312d6550a6ae4c9df18001afb1b420cb72f) ([#10002](https://github.com/yt-dlp/yt-dlp/issues/10002)) by [pzhlkj6612](https://github.com/pzhlkj6612)
- **vimeo**
- [Always try to extract original format](https://github.com/yt-dlp/yt-dlp/commit/4115c24d157c5b5f63089d75c4e0f51d1f8b4489) ([#10721](https://github.com/yt-dlp/yt-dlp/issues/10721)) by [bashonly](https://github.com/bashonly) (With fixes in [e8e6a98](https://github.com/yt-dlp/yt-dlp/commit/e8e6a982a1b659eed434d225d7922f632bac6568) by [seproDev](https://github.com/seproDev))
- [Fix HLS audio format sorting](https://github.com/yt-dlp/yt-dlp/commit/a1b4ac2b8ed8e6eaa56044d439f1e0d00c2ba218) ([#11082](https://github.com/yt-dlp/yt-dlp/issues/11082)) by [fireattack](https://github.com/fireattack)
- **watchespn**: [Improve auth support](https://github.com/yt-dlp/yt-dlp/commit/7adff8caf152dcf96d03aff69ed8545c0a63567c) ([#10910](https://github.com/yt-dlp/yt-dlp/issues/10910)) by [ischmidt20](https://github.com/ischmidt20)
- **wistia**: [Support password-protected videos](https://github.com/yt-dlp/yt-dlp/commit/9f5c9a90898c5a1e672922d9cd799716c73cee34) ([#11100](https://github.com/yt-dlp/yt-dlp/issues/11100)) by [bashonly](https://github.com/bashonly)
- **ximalaya**: [Add VIP support](https://github.com/yt-dlp/yt-dlp/commit/3dfd720d098b4d49d69cfc77e6376f22bcd90934) ([#10832](https://github.com/yt-dlp/yt-dlp/issues/10832)) by [seproDev](https://github.com/seproDev), [xingchensong](https://github.com/xingchensong)
- **xinpianchang**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/3aa0156e05662923d130ddbc1c82596e38c01a00) ([#10950](https://github.com/yt-dlp/yt-dlp/issues/10950)) by [seproDev](https://github.com/seproDev)
- **yleareena**: [Support podcasts](https://github.com/yt-dlp/yt-dlp/commit/48d629d461e05b1b19f5e53dc959bb9ebe95da42) ([#11104](https://github.com/yt-dlp/yt-dlp/issues/11104)) by [bashonly](https://github.com/bashonly)
- **youtube**
- [Add `po_token`, `visitor_data`, `data_sync_id` extractor args](https://github.com/yt-dlp/yt-dlp/commit/3a3bd00037e9908e87da4fa9f2ad772aa34dc60e) ([#10648](https://github.com/yt-dlp/yt-dlp/issues/10648)) by [bashonly](https://github.com/bashonly), [coletdjnz](https://github.com/coletdjnz), [seproDev](https://github.com/seproDev) (With fixes in [fa2be9a](https://github.com/yt-dlp/yt-dlp/commit/fa2be9a7c63babede07480151363e54eee5702bd) by [bashonly](https://github.com/bashonly))
- [Support excluding `player_client`s in extractor-arg](https://github.com/yt-dlp/yt-dlp/commit/49f3741a820ed142f6866317c2e7d247b130960e) ([#10710](https://github.com/yt-dlp/yt-dlp/issues/10710)) by [bashonly](https://github.com/bashonly)
- clip: [Prioritize `https` formats](https://github.com/yt-dlp/yt-dlp/commit/1d84b780cf33a1d84756825ac23f990a905703df) ([#11102](https://github.com/yt-dlp/yt-dlp/issues/11102)) by [bashonly](https://github.com/bashonly)
- tab: [Fix shorts tab extraction](https://github.com/yt-dlp/yt-dlp/commit/9431777b4c37129a6093080c77ca59960afbb9d7) ([#10938](https://github.com/yt-dlp/yt-dlp/issues/10938)) by [seproDev](https://github.com/seproDev)
#### Networking changes
- [Fix handler not being added to RequestError](https://github.com/yt-dlp/yt-dlp/commit/d1c4d88b2d912e8da5e76db455562ca63b1af690) ([#10955](https://github.com/yt-dlp/yt-dlp/issues/10955)) by [coletdjnz](https://github.com/coletdjnz)
- [Pin `curl-cffi` version to < 0.7.2](https://github.com/yt-dlp/yt-dlp/commit/5bb1aa04dafce13ba9de707ea53169fab58b5207) ([#11092](https://github.com/yt-dlp/yt-dlp/issues/11092)) by [bashonly](https://github.com/bashonly)
- **Request Handler**: websockets: [Upgrade websockets to 13.0](https://github.com/yt-dlp/yt-dlp/commit/6f9e6537434562d513d0c9b68ced8a61ade94a64) ([#10815](https://github.com/yt-dlp/yt-dlp/issues/10815)) by [coletdjnz](https://github.com/coletdjnz)
#### Misc. changes
- **build**
- [Bump PyInstaller version pin to `>=6.10.0`](https://github.com/yt-dlp/yt-dlp/commit/fb8b7f226d251e521a89b23c415e249e5b788e5c) ([#10709](https://github.com/yt-dlp/yt-dlp/issues/10709)) by [bashonly](https://github.com/bashonly)
- [Pin `delocate` version for `macos`](https://github.com/yt-dlp/yt-dlp/commit/7e41628ff523b3fe373b0981a5db441358980dab) ([#10901](https://github.com/yt-dlp/yt-dlp/issues/10901)) by [bashonly](https://github.com/bashonly)
- **ci**
- [Add comment sanitization workflow](https://github.com/yt-dlp/yt-dlp/commit/b6200bdcf3a9415ae36859188f9a57e3e461c696) ([#10915](https://github.com/yt-dlp/yt-dlp/issues/10915)) by [bashonly](https://github.com/bashonly), [Grub4K](https://github.com/Grub4K)
- [Add issue tracker anti-spam protection](https://github.com/yt-dlp/yt-dlp/commit/ad9a8115aa29a1a95c961b16fcf129a228d98f50) ([#10861](https://github.com/yt-dlp/yt-dlp/issues/10861)) by [bashonly](https://github.com/bashonly)
- **cleanup**: Miscellaneous: [c6387ab](https://github.com/yt-dlp/yt-dlp/commit/c6387abc1af9842bb0541288a5610abba9b1ab51) by [bashonly](https://github.com/bashonly), [Codenade](https://github.com/Codenade), [coletdjnz](https://github.com/coletdjnz), [grqz](https://github.com/grqz), [Grub4K](https://github.com/Grub4K), [pzhlkj6612](https://github.com/pzhlkj6612), [seproDev](https://github.com/seproDev)
### 2024.08.06
#### Core changes
- **jsinterp**: [Improve `slice` implementation](https://github.com/yt-dlp/yt-dlp/commit/bb8bf1db993f59752d20b73b861bd55e40cf0e31) ([#10664](https://github.com/yt-dlp/yt-dlp/issues/10664)) by [seproDev](https://github.com/seproDev)
#### Extractor changes
- **discoveryplusitaly**: [Support sport and olympics URLs](https://github.com/yt-dlp/yt-dlp/commit/e7d73bc4531ee3f91a46b15e218dcc1fbeb6226c) ([#10655](https://github.com/yt-dlp/yt-dlp/issues/10655)) by [bashonly](https://github.com/bashonly)
- **gem.cbc.ca**: live: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/fc5eecfa31c9571b6031cc3968aaa0394be55d7a) ([#10565](https://github.com/yt-dlp/yt-dlp/issues/10565)) by [bashonly](https://github.com/bashonly), [scribblemaniac](https://github.com/scribblemaniac)
- **niconico**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/4d9231208332d4c32364b8cd814bff8b20232cae) ([#10677](https://github.com/yt-dlp/yt-dlp/issues/10677)) by [bashonly](https://github.com/bashonly)
- **olympics**: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/919540a9644e55deb78cdd6751757ec8fdaf76f4) ([#10625](https://github.com/yt-dlp/yt-dlp/issues/10625)) by [bashonly](https://github.com/bashonly)
- **youku**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/0088c6de23d832b117061a33e984dc452d992e9c) ([#10626](https://github.com/yt-dlp/yt-dlp/issues/10626)) by [hugepower](https://github.com/hugepower)
- **youtube**
- [Change default player clients to `ios,web_creator`](https://github.com/yt-dlp/yt-dlp/commit/406f4c2e47502fffc1b0c210b4ee6487c89a44cb) ([#10674](https://github.com/yt-dlp/yt-dlp/issues/10674)) by [bashonly](https://github.com/bashonly)
- [Fix `n` function name extraction for player `b12cc44b`](https://github.com/yt-dlp/yt-dlp/commit/c86891eb9434b4d7eec426d38c0c625b5e13cb2f) ([#10668](https://github.com/yt-dlp/yt-dlp/issues/10668)) by [seproDev](https://github.com/seproDev)
### 2024.08.01
#### Core changes
- **utils**: `unified_timestamp`: [Recognize Sunday](https://github.com/yt-dlp/yt-dlp/commit/6daf2c27c0464fba98337be30de0b66d520d0db1) ([#10589](https://github.com/yt-dlp/yt-dlp/issues/10589)) by [bashonly](https://github.com/bashonly)
#### Extractor changes
- **abematv**: [Fix availability extraction](https://github.com/yt-dlp/yt-dlp/commit/ef36d517f9b05785d61abca7691d9ab7d63cc75c) ([#10569](https://github.com/yt-dlp/yt-dlp/issues/10569)) by [middlingphys](https://github.com/middlingphys)
- **cbc.ca**: player: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/94a1c5e642e468cebeb51f74c6c220434cb47d96) ([#10302](https://github.com/yt-dlp/yt-dlp/issues/10302)) by [bashonly](https://github.com/bashonly), [trainman261](https://github.com/trainman261)
- **discoveryplus**: [Support olympics URLs](https://github.com/yt-dlp/yt-dlp/commit/0b7728618417e1aa382722a4d29b916b594d4459) ([#10566](https://github.com/yt-dlp/yt-dlp/issues/10566)) by [bashonly](https://github.com/bashonly)
- **kick**: clips: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/bb3936ae2b3ce96d0b53f9e17cad1082058f032b) ([#10572](https://github.com/yt-dlp/yt-dlp/issues/10572)) by [luvyana](https://github.com/luvyana)
- **learningonscreen**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/fe15d3178e242803ae7a934b90137f13598eba2e) ([#10590](https://github.com/yt-dlp/yt-dlp/issues/10590)) by [bashonly](https://github.com/bashonly), [Grub4K](https://github.com/Grub4K)
- **mediaklikk**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/7e3e4779ad13e4511c9ba3869879e53f0267bd7a) ([#10605](https://github.com/yt-dlp/yt-dlp/issues/10605)) by [szantnerb](https://github.com/szantnerb)
- **mlbtv**: [Fix makeup game extraction](https://github.com/yt-dlp/yt-dlp/commit/4b69e1b53ea21e631cd5dd68ff531e2f1671ec17) ([#10607](https://github.com/yt-dlp/yt-dlp/issues/10607)) by [bashonly](https://github.com/bashonly)
- **olympics**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/2f1ddfe12a2c174bc777264c5c8ffe7ca0922d94) ([#10604](https://github.com/yt-dlp/yt-dlp/issues/10604)) by [bashonly](https://github.com/bashonly)
- **tva**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/28d485714fef88937c82635438afba5db81f9089) ([#10567](https://github.com/yt-dlp/yt-dlp/issues/10567)) by [bashonly](https://github.com/bashonly)
- **tver**: [Support olympic URLs](https://github.com/yt-dlp/yt-dlp/commit/5260696b1cba77161828941fdb38f09f14ac6c60) ([#10600](https://github.com/yt-dlp/yt-dlp/issues/10600)) by [vvto33](https://github.com/vvto33)
- **vimeo**: review: [Fix password-protected video extraction](https://github.com/yt-dlp/yt-dlp/commit/2b6df93a243bdfb9d6bb5c1e18020625cd02d465) ([#10598](https://github.com/yt-dlp/yt-dlp/issues/10598)) by [bashonly](https://github.com/bashonly)
- **youtube**
- [Change default player clients to `ios,tv`](https://github.com/yt-dlp/yt-dlp/commit/efb42763dec23ccf6a2e3bac3afbfefce8efd012) ([#10457](https://github.com/yt-dlp/yt-dlp/issues/10457)) by [seproDev](https://github.com/seproDev)
- [Fix `n` function name extraction for player `20dfca59`](https://github.com/yt-dlp/yt-dlp/commit/011b4a04db2a636c3ef0a0ad4e2d3ae482c9fd76) ([#10611](https://github.com/yt-dlp/yt-dlp/issues/10611)) by [bashonly](https://github.com/bashonly)
- [Fix age-verification workaround](https://github.com/yt-dlp/yt-dlp/commit/d19fcb934269465fd707e68a87f735ec6983e93d) ([#10610](https://github.com/yt-dlp/yt-dlp/issues/10610)) by [bashonly](https://github.com/bashonly), [Grub4K](https://github.com/Grub4K)
- [Player client maintenance](https://github.com/yt-dlp/yt-dlp/commit/0e539617a41913c7da1edd74fb6543c10ad727b3) ([#10573](https://github.com/yt-dlp/yt-dlp/issues/10573)) by [bashonly](https://github.com/bashonly)
#### Misc. changes
- **cleanup**: Miscellaneous: [ffd7781](https://github.com/yt-dlp/yt-dlp/commit/ffd7781d6588926f820b44a34b9e6e3068fb9f97) by [bashonly](https://github.com/bashonly)
### 2024.07.25
#### Extractor changes
- **abematv**: [Adapt key retrieval to request handler framework](https://github.com/yt-dlp/yt-dlp/commit/a3bab4752a2b3d56e5a59b4e0411bb8f695c010b) ([#10491](https://github.com/yt-dlp/yt-dlp/issues/10491)) by [bashonly](https://github.com/bashonly)
- **facebook**: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/1a34a802f44a1dab8f642c79c3cc810e21541d3b) ([#10531](https://github.com/yt-dlp/yt-dlp/issues/10531)) by [bashonly](https://github.com/bashonly)
- **mlbtv**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/f0993391e6052ec8f7aacc286609564f226943b9) ([#10515](https://github.com/yt-dlp/yt-dlp/issues/10515)) by [bashonly](https://github.com/bashonly)
- **tiktok**: [Fix and deprioritize JSON subtitles](https://github.com/yt-dlp/yt-dlp/commit/2f97779f335ac069ecccd9c7bf81abf4a83cfe7a) ([#10516](https://github.com/yt-dlp/yt-dlp/issues/10516)) by [bashonly](https://github.com/bashonly)
- **vimeo**: [Fix chapters extraction](https://github.com/yt-dlp/yt-dlp/commit/a0a1bc3d8d8e3bb9a48a06e835815a0460e90e77) ([#10544](https://github.com/yt-dlp/yt-dlp/issues/10544)) by [bashonly](https://github.com/bashonly)
- **youtube**: [Fix `n` function name extraction for player `3400486c`](https://github.com/yt-dlp/yt-dlp/commit/713b4cd18f00556771af8cfdd9cea6cc1a09e948) ([#10542](https://github.com/yt-dlp/yt-dlp/issues/10542)) by [bashonly](https://github.com/bashonly)
#### Misc. changes
- **build**: [Pin `setuptools` version](https://github.com/yt-dlp/yt-dlp/commit/e046db8a116b1c320d4785daadd48ea0b22a3987) ([#10493](https://github.com/yt-dlp/yt-dlp/issues/10493)) by [bashonly](https://github.com/bashonly)
### 2024.07.16
#### Core changes
- [Fix `noprogress` if `test=True` with `--quiet` and `--verbose`](https://github.com/yt-dlp/yt-dlp/commit/66ce3d76d87af3f81cc9dfec4be4704016cb1cdb) ([#10454](https://github.com/yt-dlp/yt-dlp/issues/10454)) by [Grub4K](https://github.com/Grub4K)
- [Support `auto-tty` and `no_color-tty` for `--color`](https://github.com/yt-dlp/yt-dlp/commit/d9cbced493cae2008508d94a2db5dd98be7c01fc) ([#10453](https://github.com/yt-dlp/yt-dlp/issues/10453)) by [Grub4K](https://github.com/Grub4K)
- **update**: [Fix network error handling](https://github.com/yt-dlp/yt-dlp/commit/ed1b9ed93dd90d2cc960c0d8eaa9d919db224203) ([#10486](https://github.com/yt-dlp/yt-dlp/issues/10486)) by [bashonly](https://github.com/bashonly)
- **utils**: `parse_codecs`: [Fix parsing of mixed case codec strings](https://github.com/yt-dlp/yt-dlp/commit/cc0070f6496e501d77352bad475fb02d6a86846a) by [bashonly](https://github.com/bashonly)
#### Extractor changes
- **adn**: [Adjust for .com domain change](https://github.com/yt-dlp/yt-dlp/commit/959b7a379b8e5da059d110a63339c964b6265736) ([#10399](https://github.com/yt-dlp/yt-dlp/issues/10399)) by [infanf](https://github.com/infanf)
- **afreecatv**: [Fix login and use `legacy_ssl`](https://github.com/yt-dlp/yt-dlp/commit/4cd41469243624d90b7a2009b95cbe0609343efe) ([#10440](https://github.com/yt-dlp/yt-dlp/issues/10440)) by [bashonly](https://github.com/bashonly)
- **box**: [Support enterprise URLs](https://github.com/yt-dlp/yt-dlp/commit/705f5b84dec75cc7af97f42fd1530e8062735970) ([#10419](https://github.com/yt-dlp/yt-dlp/issues/10419)) by [seproDev](https://github.com/seproDev)
- **digitalconcerthall**: [Extract HEVC and FLAC formats](https://github.com/yt-dlp/yt-dlp/commit/e62fa6b0e0186f8c5666c2c5ab64cf191abdafc1) ([#10470](https://github.com/yt-dlp/yt-dlp/issues/10470)) by [bashonly](https://github.com/bashonly)
- **dplay**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/39e6c4cb44b9292e89ac0afec3cd0afc2ae8775f) ([#10471](https://github.com/yt-dlp/yt-dlp/issues/10471)) by [bashonly](https://github.com/bashonly)
- **epidemicsound**: [Support sound effects URLs](https://github.com/yt-dlp/yt-dlp/commit/8531d2b03bac9cc746f2ee8098aaf8f115505f5b) ([#10436](https://github.com/yt-dlp/yt-dlp/issues/10436)) by [iancmy](https://github.com/iancmy)
- **generic**: [Fix direct video link extensions](https://github.com/yt-dlp/yt-dlp/commit/b9afb99e7c34d0eb15ddc6689cd7d20eebfda68e) ([#10468](https://github.com/yt-dlp/yt-dlp/issues/10468)) by [bashonly](https://github.com/bashonly)
- **picarto**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/bacd18b7df08b4995644fd12cee1f8c8e8636bc7) ([#10414](https://github.com/yt-dlp/yt-dlp/issues/10414)) by [Frankgoji](https://github.com/Frankgoji)
- **soundcloud**: permalink, user: [Extract tracks only](https://github.com/yt-dlp/yt-dlp/commit/22870b81bad97dfa6307a7add44753b2dffc76a9) ([#10463](https://github.com/yt-dlp/yt-dlp/issues/10463)) by [DunnesH](https://github.com/DunnesH)
- **tiktok**: live: [Fix room ID extraction](https://github.com/yt-dlp/yt-dlp/commit/d2189d3d36987ebeac426fd70a60a5fe86325a2b) ([#10408](https://github.com/yt-dlp/yt-dlp/issues/10408)) by [mokrueger](https://github.com/mokrueger)
- **tv5monde**: [Support browser impersonation](https://github.com/yt-dlp/yt-dlp/commit/9b95a6765a5f6325af99c4aca961587f0c426e8c) ([#10417](https://github.com/yt-dlp/yt-dlp/issues/10417)) by [bashonly](https://github.com/bashonly) (With fixes in [cc1a309](https://github.com/yt-dlp/yt-dlp/commit/cc1a3098c00995c6aebc2a16bd1050a66bad64db))
- **youtube**
- [Avoid poToken experiment player responses](https://github.com/yt-dlp/yt-dlp/commit/8b8b442cb005a8d85315f301615f83fb736b967a) ([#10456](https://github.com/yt-dlp/yt-dlp/issues/10456)) by [seproDev](https://github.com/seproDev) (With fixes in [16da8ef](https://github.com/yt-dlp/yt-dlp/commit/16da8ef9937ff76632dfef02e5062c5ba99c8ea2))
- [Invalidate nsig cache from < 2024.07.09](https://github.com/yt-dlp/yt-dlp/commit/04e17ba20a139f1b3e30ec4bafa3fba26888f0b3) ([#10401](https://github.com/yt-dlp/yt-dlp/issues/10401)) by [bashonly](https://github.com/bashonly)
- [Reduce android client priority](https://github.com/yt-dlp/yt-dlp/commit/b85eef0a615a01304f88a3847309c667e09a20df) ([#10467](https://github.com/yt-dlp/yt-dlp/issues/10467)) by [seproDev](https://github.com/seproDev)
#### Networking changes
- [Add `legacy_ssl` request extension](https://github.com/yt-dlp/yt-dlp/commit/150ecc45d9cacc919550c13b04fd998ac5103a6b) ([#10448](https://github.com/yt-dlp/yt-dlp/issues/10448)) by [coletdjnz](https://github.com/coletdjnz)
- **Request Handler**: curl_cffi: [Support `curl_cffi` 0.7.X](https://github.com/yt-dlp/yt-dlp/commit/42bfca00a6b460fc053514cdd7ac6f5b5daddf0c) by [coletdjnz](https://github.com/coletdjnz)
#### Misc. changes
- **build**
- [Include `curl_cffi` in `yt-dlp_linux`](https://github.com/yt-dlp/yt-dlp/commit/4521f30d1479315cd5c3bf4abdad19391952df98) by [bashonly](https://github.com/bashonly)
- [Pin `curl-cffi` to 0.5.10 for Windows](https://github.com/yt-dlp/yt-dlp/commit/ac30941ae682f71eab010877c9a977736a61d3cf) by [bashonly](https://github.com/bashonly)
- **cleanup**: Miscellaneous: [89a161e](https://github.com/yt-dlp/yt-dlp/commit/89a161e8c62569a662deda1c948664152efcb6b4) by [bashonly](https://github.com/bashonly)
### 2024.07.09
#### Core changes
- [Do not alter default format selection when simulated](https://github.com/yt-dlp/yt-dlp/commit/0b570f2a90ce2363ba06089217514d644e7be2e0) ([#9862](https://github.com/yt-dlp/yt-dlp/issues/9862)) by [seproDev](https://github.com/seproDev)
#### Extractor changes
- **youtube**: [Remove broken `n` function extraction fallback](https://github.com/yt-dlp/yt-dlp/commit/7ead7332af69422cee931aec3faa277288e9e212) ([#10396](https://github.com/yt-dlp/yt-dlp/issues/10396)) by [pukkandan](https://github.com/pukkandan), [seproDev](https://github.com/seproDev)
### 2024.07.08
#### Core changes
- **jsinterp**: [Implement `Function.prototype` resolving for `call` and `apply`](https://github.com/yt-dlp/yt-dlp/commit/6c056ea7aeb03660281653a9668547f2548f194f) ([#10392](https://github.com/yt-dlp/yt-dlp/issues/10392)) by [Grub4K](https://github.com/Grub4K)
#### Extractor changes
- **soundcloud**: [Fix rate-limit handling](https://github.com/yt-dlp/yt-dlp/commit/4b50b292cc98534fb8c7cdf0ae5cb85862f7ebfc) ([#10389](https://github.com/yt-dlp/yt-dlp/issues/10389)) by [bashonly](https://github.com/bashonly)
- **youtube**: [Fix JS `n` function name extraction](https://github.com/yt-dlp/yt-dlp/commit/297b0a379282a15c80d82d51f3757c961db2dae1) ([#10390](https://github.com/yt-dlp/yt-dlp/issues/10390)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
### 2024.07.07
#### Important changes
- Security: [[ie/douyutv] Do not use dangerous javascript source/URL](https://github.com/yt-dlp/yt-dlp/security/advisories/GHSA-3v33-3wmw-3785)
- A dependency on potentially malicious third-party JavaScript code has been removed from the Douyu extractors
#### Core changes
- [Address gaps in allowed extensions](https://github.com/yt-dlp/yt-dlp/commit/2469119490d7e0397ebbf5c5ae327316f955eef2) ([#10362](https://github.com/yt-dlp/yt-dlp/issues/10362)) by [bashonly](https://github.com/bashonly)
- [Fix `--ignore-no-formats-error`](https://github.com/yt-dlp/yt-dlp/commit/cc767e9490056efaaa11c186b0d032e4b4969180) ([#10345](https://github.com/yt-dlp/yt-dlp/issues/10345)) by [Grub4K](https://github.com/Grub4K)
#### Extractor changes
- **abematv**: [Extract availability](https://github.com/yt-dlp/yt-dlp/commit/2a1a1b8e67e864289ac7ba5d05ec63dbb19a639f) ([#10348](https://github.com/yt-dlp/yt-dlp/issues/10348)) by [middlingphys](https://github.com/middlingphys)
- **chzzk**: [Extract with API v3](https://github.com/yt-dlp/yt-dlp/commit/4862a29854d4044120e3f97b52199711ad04bee1) ([#10363](https://github.com/yt-dlp/yt-dlp/issues/10363)) by [hui1601](https://github.com/hui1601)
- **douyutv**: [Do not use dangerous javascript source/URL](https://github.com/yt-dlp/yt-dlp/commit/6075a029dba70a89675ae1250e7cdfd91f0eba41) ([#10347](https://github.com/yt-dlp/yt-dlp/issues/10347)) by [LeSuisse](https://github.com/LeSuisse)
- **jiosaavn**: playlist: [Support featured playlists](https://github.com/yt-dlp/yt-dlp/commit/f0f867f008a1728f5f6ac1224b9e014b5d27f817) ([#10382](https://github.com/yt-dlp/yt-dlp/issues/10382)) by [harbhim](https://github.com/harbhim)
- **vidyard**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/00766ece0c5c7a80781a4ff677198c5fb69d9dc0) ([#10155](https://github.com/yt-dlp/yt-dlp/issues/10155)) by [exterrestris](https://github.com/exterrestris)
- **vimeo**: [Fix password-protected video extraction](https://github.com/yt-dlp/yt-dlp/commit/c1c9bb4adb42d0d93a2fb5d93a7de0a87b6ba884) ([#10341](https://github.com/yt-dlp/yt-dlp/issues/10341)) by [bashonly](https://github.com/bashonly)
- **vtv**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/987a1f94c24275f2b0cd82e719956687415dd732) ([#10173](https://github.com/yt-dlp/yt-dlp/issues/10173)) by [DinhHuy2010](https://github.com/DinhHuy2010)
- **yle_areena**
- [Fix metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/4cdc976bd861b5835601ae402bef543eacd88f3d) ([#10380](https://github.com/yt-dlp/yt-dlp/issues/10380)) by [seproDev](https://github.com/seproDev)
- [Fix subtitle extraction](https://github.com/yt-dlp/yt-dlp/commit/0d174e8bed32081eb38ef7f5d1a1282ae154f517) ([#10379](https://github.com/yt-dlp/yt-dlp/issues/10379)) by [Grub4K](https://github.com/Grub4K)
#### Misc. changes
- **cleanup**: Miscellaneous: [b337d29](https://github.com/yt-dlp/yt-dlp/commit/b337d2989ce0614651d363383f6f743d977248ef) by [bashonly](https://github.com/bashonly)
### 2024.07.02
#### Core changes
- [Fix `--compat-opt allow-unsafe-ext`](https://github.com/yt-dlp/yt-dlp/commit/773bbb181506856ffda95496ab60c1c9603f1f71) ([#10336](https://github.com/yt-dlp/yt-dlp/issues/10336)) by [bashonly](https://github.com/bashonly), [rdamas](https://github.com/rdamas)
#### Extractor changes
- **banbye**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/7509791385ba88cb7ec0ab17e826681f4af4b66e) ([#10332](https://github.com/yt-dlp/yt-dlp/issues/10332)) by [PatrykMis](https://github.com/PatrykMis), [seproDev](https://github.com/seproDev)
- **murrtube**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/6403530e2dfe259a87afe444708c4f3024cc45b8) ([#9249](https://github.com/yt-dlp/yt-dlp/issues/9249)) by [DrakoCpp](https://github.com/DrakoCpp)
- **zaiko**: [Support JWT video URLs](https://github.com/yt-dlp/yt-dlp/commit/7799e518956387bb3c1064c9beae26eab8d5044a) ([#10130](https://github.com/yt-dlp/yt-dlp/issues/10130)) by [pzhlkj6612](https://github.com/pzhlkj6612)
#### Postprocessor changes
- **embedthumbnail**: [Fix embedding with mutagen](https://github.com/yt-dlp/yt-dlp/commit/d502f4c6d95b74896f40070d07229997f0850f31) ([#10337](https://github.com/yt-dlp/yt-dlp/issues/10337)) by [bashonly](https://github.com/bashonly)
#### Misc. changes
- **cleanup**: Miscellaneous: [93d33cb](https://github.com/yt-dlp/yt-dlp/commit/93d33cb29af9e2e84369ac43589d50ce8e0160ef) by [bashonly](https://github.com/bashonly)
### 2024.07.01
#### Important changes
- Security: [[CVE-2024-10123](https://nvd.nist.gov/vuln/detail/CVE-2024-10123)] [Properly sanitize file-extension to prevent file system modification and RCE](https://github.com/yt-dlp/yt-dlp/security/advisories/GHSA-79w7-vh3h-8g4j)
- Security: [[CVE-2024-38519](https://nvd.nist.gov/vuln/detail/CVE-2024-38519)] [Properly sanitize file-extension to prevent file system modification and RCE](https://github.com/yt-dlp/yt-dlp/security/advisories/GHSA-79w7-vh3h-8g4j)
- Unsafe extensions are now blocked from being downloaded
#### Core changes

View File

@@ -21,7 +21,7 @@ clean-test:
rm -rf test/testdata/sigs/player-*.js tmp/ *.annotations.xml *.aria2 *.description *.dump *.frag \
*.frag.aria2 *.frag.urls *.info.json *.live_chat.json *.meta *.part* *.tmp *.temp *.unknown_video *.ytdl \
*.3gp *.ape *.ass *.avi *.desktop *.f4v *.flac *.flv *.gif *.jpeg *.jpg *.lrc *.m4a *.m4v *.mhtml *.mkv *.mov *.mp3 *.mp4 \
*.mpg *.mpga *.oga *.ogg *.opus *.png *.sbv *.srt *.ssa *.swf *.swp *.tt *.ttml *.url *.vtt *.wav *.webloc *.webm *.webp
*.mpg *.mpga *.oga *.ogg *.opus *.png *.sbv *.srt *.ssa *.swf *.tt *.ttml *.url *.vtt *.wav *.webloc *.webm *.webp
clean-dist:
rm -rf yt-dlp.1.temp.md yt-dlp.1 README.txt MANIFEST build/ dist/ .coverage cover/ yt-dlp.tar.gz completions/ \
yt_dlp/extractor/lazy_extractors.py *.spec CONTRIBUTING.md.tmp yt-dlp yt-dlp.exe yt_dlp.egg-info/ AUTHORS

View File

@@ -200,9 +200,9 @@ While all the other dependencies are optional, `ffmpeg` and `ffprobe` are highly
The following provide support for impersonating browser requests. This may be required for some sites that employ TLS fingerprinting.
* [**curl_cffi**](https://github.com/yifeikong/curl_cffi) (recommended) - Python binding for [curl-impersonate](https://github.com/lwthiker/curl-impersonate). Provides impersonation targets for Chrome, Edge and Safari. Licensed under [MIT](https://github.com/yifeikong/curl_cffi/blob/main/LICENSE)
* [**curl_cffi**](https://github.com/lexiforest/curl_cffi) (recommended) - Python binding for [curl-impersonate](https://github.com/lexiforest/curl-impersonate). Provides impersonation targets for Chrome, Edge and Safari. Licensed under [MIT](https://github.com/lexiforest/curl_cffi/blob/main/LICENSE)
* Can be installed with the `curl-cffi` group, e.g. `pip install "yt-dlp[default,curl-cffi]"`
* Currently only included in `yt-dlp.exe` and `yt-dlp_macos` builds
* Currently included in `yt-dlp.exe`, `yt-dlp_linux` and `yt-dlp_macos` builds
### Metadata
@@ -368,7 +368,9 @@ If you fork the project on GitHub, you can run your fork's [build workflow](.git
stderr) to apply the setting to. Can be one
of "always", "auto" (default), "never", or
"no_color" (use non color terminal
sequences). Can be used multiple times
sequences). Use "auto-tty" or "no_color-tty"
to decide based on terminal support only.
Can be used multiple times
--compat-options OPTS Options that can help keep compatibility
with youtube-dl or youtube-dlc
configurations by reverting some of the
@@ -457,17 +459,17 @@ If you fork the project on GitHub, you can run your fork's [build workflow](.git
conditions. Use a "\" to escape "&" or
quotes if needed. If used multiple times,
the filter matches if at least one of the
conditions is met. E.g. --match-filter
!is_live --match-filter "like_count>?100 &
conditions is met. E.g. --match-filters
!is_live --match-filters "like_count>?100 &
description~='(?i)\bcats \& dogs\b'" matches
only videos that are not live OR those that
have a like count more than 100 (or the like
field is not available) and also has a
description that contains the phrase "cats &
dogs" (caseless). Use "--match-filter -" to
dogs" (caseless). Use "--match-filters -" to
interactively ask whether to download each
video
--no-match-filters Do not use any --match-filter (default)
--no-match-filters Do not use any --match-filters (default)
--break-match-filters FILTER Same as "--match-filters" but stops the
download process when a video is rejected
--no-break-match-filters Do not use any --break-match-filters (default)
@@ -488,7 +490,7 @@ If you fork the project on GitHub, you can run your fork's [build workflow](.git
encountering a file that is in the archive
(default)
--break-per-input Alters --max-downloads, --break-on-existing,
--break-match-filter, and autonumber to
--break-match-filters, and autonumber to
reset per input URL
--no-break-per-input --break-on-existing and similar options
terminates the entire download queue
@@ -997,12 +999,16 @@ If you fork the project on GitHub, you can run your fork's [build workflow](.git
be used multiple times
--no-exec Remove any previously defined --exec
--convert-subs FORMAT Convert the subtitles to another format
(currently supported: ass, lrc, srt, vtt)
(Alias: --convert-subtitles)
(currently supported: ass, lrc, srt, vtt).
Use "--convert-subs none" to disable
conversion (default) (Alias: --convert-
subtitles)
--convert-thumbnails FORMAT Convert the thumbnails to another format
(currently supported: jpg, png, webp). You
can specify multiple rules using similar
syntax as --remux-video
syntax as "--remux-video". Use "--convert-
thumbnails none" to disable conversion
(default)
--split-chapters Split video into multiple files based on
internal chapters. The "chapter:" prefix can
be used with "--paths" and "--output" to set
@@ -1756,7 +1762,7 @@ $ yt-dlp --replace-in-metadata "title,uploader" "[ _]" "-"
# EXTRACTOR ARGUMENTS
Some extractors accept additional arguments which can be passed using `--extractor-args KEY:ARGS`. `ARGS` is a `;` (semicolon) separated string of `ARG=VAL1,VAL2`. E.g. `--extractor-args "youtube:player-client=android_embedded,web;formats=incomplete" --extractor-args "funimation:version=uncut"`
Some extractors accept additional arguments which can be passed using `--extractor-args KEY:ARGS`. `ARGS` is a `;` (semicolon) separated string of `ARG=VAL1,VAL2`. E.g. `--extractor-args "youtube:player-client=mediaconnect,web;formats=incomplete" --extractor-args "funimation:version=uncut"`
Note: In CLI, `ARG` can use `-` instead of `_`; e.g. `youtube:player-client"` becomes `youtube:player_client"`
@@ -1765,7 +1771,7 @@ The following extractors use this feature:
#### youtube
* `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes
* `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
* `player_client`: Clients to extract video data from. The main clients are `web`, `ios` and `android`, with variants `_music`, `_embedded`, `_embedscreen`, `_creator` (e.g. `web_embedded`); and `mediaconnect`, `mweb`, `mweb_embedscreen` and `tv_embedded` (agegate bypass) with no variants. By default, `ios,web` is used, but `tv_embedded` and `creator` variants are added as required for age-gated videos. Similarly, the music variants are added for `music.youtube.com` urls. The `android` clients will always be given lowest priority since their formats are broken. You can use `all` to use all the clients, and `default` for the default clients.
* `player_client`: Clients to extract video data from. The main clients are `web`, `ios` and `android`, with variants `_music` and `_creator` (e.g. `ios_creator`); and `mediaconnect`, `mweb`, `android_producer`, `android_testsuite`, `android_vr`, `web_safari`, `web_embedded`, `tv` and `tv_embedded` with no variants. By default, `ios,web_creator` is used, and `tv_embedded`, `web_creator` and `mediaconnect` are added as required for age-gated videos. Similarly, the music variants are added for `music.youtube.com` urls. Most `android` clients will be given lowest priority since their formats are broken. You can use `all` to use all the clients, and `default` for the default clients. You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=all,-web`
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause some issues. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) for more details
* `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp.
* `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side)
@@ -1773,8 +1779,11 @@ The following extractors use this feature:
* E.g. `all,all,1000,10` will get a maximum of 1000 replies total, with up to 10 replies per thread. `1000,all,100` will get a maximum of 1000 comments, with a maximum of 100 replies total
* `formats`: Change the types of formats to return. `dashy` (convert HTTP to DASH), `duplicate` (identical content but different URLs or protocol; includes `dashy`), `incomplete` (cannot be downloaded completely - live dash and post-live m3u8)
* `innertube_host`: Innertube API host to use for all API requests; e.g. `studio.youtube.com`, `youtubei.googleapis.com`. Note that cookies exported from one subdomain will not work on others
* `innertube_key`: Innertube API key to use for all API requests
* `innertube_key`: Innertube API key to use for all API requests. By default, no API key is used
* `raise_incomplete_data`: `Incomplete Data Received` raises an error instead of reporting a warning
* `data_sync_id`: Overrides the account Data Sync ID used in Innertube API requests. This may be needed if you are using an account with `youtube:player_skip=webpage,configs` or `youtubetab:skip=webpage`
* `visitor_data`: Overrides the Visitor Data used in Innertube API requests. This should be used with `player_skip=webpage,configs` and without cookies. Note: this may have adverse effects if used improperly. If a session from a browser is wanted, you should pass cookies instead (which contain the Visitor ID)
* `po_token`: Proof of Origin (PO) Token(s) to use for requesting video playback. Comma seperated list of PO Tokens in the format `CLIENT+PO_TOKEN`, e.g. `youtube:po_token=web+XXX,android+YYY`
#### youtubetab (YouTube playlists, channels, feeds, etc.)
* `skip`: One or more of `webpage` (skip initial webpage download), `authcheck` (allow the download of playlists requiring authentication when no initial webpage is downloaded. This may cause unwanted behavior, see [#1122](https://github.com/yt-dlp/yt-dlp/pull/1122) for more details)
@@ -1859,6 +1868,9 @@ The following extractors use this feature:
#### bilibili
* `prefer_multi_flv`: Prefer extracting flv formats over mp4 for older videos that still provide legacy formats
#### digitalconcerthall
* `prefer_combined_hls`: Prefer extracting combined/pre-merged video and audio HLS formats. This will exclude 4K/HEVC video and lossless/FLAC audio formats, which are only available as split video/audio HLS formats
**Note**: These options may be changed/removed in the future without concern for backward compatibility
<!-- MANPAGE: MOVE "INSTALLATION" SECTION HERE -->
@@ -2172,9 +2184,9 @@ with yt_dlp.YoutubeDL(ydl_opts) as ydl:
* **Output template improvements**: Output templates can now have date-time formatting, numeric offsets, object traversal etc. See [output template](#output-template) for details. Even more advanced operations can also be done with the help of `--parse-metadata` and `--replace-in-metadata`
* **Other new options**: Many new options have been added such as `--alias`, `--print`, `--concat-playlist`, `--wait-for-video`, `--retry-sleep`, `--sleep-requests`, `--convert-thumbnails`, `--force-download-archive`, `--force-overwrites`, `--break-match-filter` etc
* **Other new options**: Many new options have been added such as `--alias`, `--print`, `--concat-playlist`, `--wait-for-video`, `--retry-sleep`, `--sleep-requests`, `--convert-thumbnails`, `--force-download-archive`, `--force-overwrites`, `--break-match-filters` etc
* **Improvements**: Regex and other operators in `--format`/`--match-filter`, multiple `--postprocessor-args` and `--downloader-args`, faster archive checking, more [format selection options](#format-selection), merge multi-video/audio, multiple `--config-locations`, `--exec` at different stages, etc
* **Improvements**: Regex and other operators in `--format`/`--match-filters`, multiple `--postprocessor-args` and `--downloader-args`, faster archive checking, more [format selection options](#format-selection), merge multi-video/audio, multiple `--config-locations`, `--exec` at different stages, etc
* **Plugins**: Extractors and PostProcessors can be loaded from an external file. See [plugins](#plugins) for details
@@ -2215,16 +2227,17 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
* `certifi` will be used for SSL root certificates, if installed. If you want to use system certificates (e.g. self-signed), use `--compat-options no-certifi`
* yt-dlp's sanitization of invalid characters in filenames is different/smarter than in youtube-dl. You can use `--compat-options filename-sanitization` to revert to youtube-dl's behavior
* ~~yt-dlp tries to parse the external downloader outputs into the standard progress output if possible (Currently implemented: [aria2c](https://github.com/yt-dlp/yt-dlp/issues/5931)). You can use `--compat-options no-external-downloader-progress` to get the downloader output as-is~~
* yt-dlp versions between 2021.09.01 and 2023.01.02 applies `--match-filter` to nested playlists. This was an unintentional side-effect of [8f18ac](https://github.com/yt-dlp/yt-dlp/commit/8f18aca8717bb0dd49054555af8d386e5eda3a88) and is fixed in [d7b460](https://github.com/yt-dlp/yt-dlp/commit/d7b460d0e5fc710950582baed2e3fc616ed98a80). Use `--compat-options playlist-match-filter` to revert this
* yt-dlp versions between 2021.09.01 and 2023.01.02 applies `--match-filters` to nested playlists. This was an unintentional side-effect of [8f18ac](https://github.com/yt-dlp/yt-dlp/commit/8f18aca8717bb0dd49054555af8d386e5eda3a88) and is fixed in [d7b460](https://github.com/yt-dlp/yt-dlp/commit/d7b460d0e5fc710950582baed2e3fc616ed98a80). Use `--compat-options playlist-match-filter` to revert this
* yt-dlp versions between 2021.11.10 and 2023.06.21 estimated `filesize_approx` values for fragmented/manifest formats. This was added for convenience in [f2fe69](https://github.com/yt-dlp/yt-dlp/commit/f2fe69c7b0d208bdb1f6292b4ae92bc1e1a7444a), but was reverted in [0dff8e](https://github.com/yt-dlp/yt-dlp/commit/0dff8e4d1e6e9fb938f4256ea9af7d81f42fd54f) due to the potentially extreme inaccuracy of the estimated values. Use `--compat-options manifest-filesize-approx` to keep extracting the estimated values
* yt-dlp uses modern http client backends such as `requests`. Use `--compat-options prefer-legacy-http-handler` to prefer the legacy http handler (`urllib`) to be used for standard http requests.
* The sub-modules `swfinterp`, `casefold` are removed.
* Passing `--simulate` (or calling `extract_info` with `download=False`) no longer alters the default format selection. See [#9843](https://github.com/yt-dlp/yt-dlp/issues/9843) for details.
For ease of use, a few more compat options are available:
* `--compat-options all`: Use all compat options (Do NOT use)
* `--compat-options youtube-dl`: Same as `--compat-options all,-multistreams,-playlist-match-filter,-manifest-filesize-approx`
* `--compat-options youtube-dlc`: Same as `--compat-options all,-no-live-chat,-no-youtube-channel-redirect,-playlist-match-filter,-manifest-filesize-approx`
* `--compat-options all`: Use all compat options (**Do NOT use this!**)
* `--compat-options youtube-dl`: Same as `--compat-options all,-multistreams,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext`
* `--compat-options youtube-dlc`: Same as `--compat-options all,-no-live-chat,-no-youtube-channel-redirect,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext`
* `--compat-options 2021`: Same as `--compat-options 2022,no-certifi,filename-sanitization,no-youtube-prefer-utc-upload-date`
* `--compat-options 2022`: Same as `--compat-options 2023,playlist-match-filter,no-external-downloader-progress,prefer-legacy-http-handler,manifest-filesize-approx`
* `--compat-options 2023`: Currently does nothing. Use this to enable all future compat options
@@ -2260,11 +2273,11 @@ While these options are redundant, they are still expected to be used due to the
--get-thumbnail --print thumbnail
-e, --get-title --print title
-g, --get-url --print urls
--match-title REGEX --match-filter "title ~= (?i)REGEX"
--reject-title REGEX --match-filter "title !~= (?i)REGEX"
--min-views COUNT --match-filter "view_count >=? COUNT"
--max-views COUNT --match-filter "view_count <=? COUNT"
--break-on-reject Use --break-match-filter
--match-title REGEX --match-filters "title ~= (?i)REGEX"
--reject-title REGEX --match-filters "title !~= (?i)REGEX"
--min-views COUNT --match-filters "view_count >=? COUNT"
--max-views COUNT --match-filters "view_count <=? COUNT"
--break-on-reject Use --break-match-filters
--user-agent UA --add-header "User-Agent:UA"
--referer URL --add-header "Referer:URL"
--playlist-start NUMBER -I NUMBER:

View File

@@ -2,7 +2,7 @@
set -e
source ~/.local/share/pipx/venvs/pyinstaller/bin/activate
python -m devscripts.install_deps --include secretstorage
python -m devscripts.install_deps --include secretstorage --include curl-cffi
python -m devscripts.make_lazy_extractors
python devscripts/update-version.py -c "${channel}" -r "${origin}" "${version}"
python -m bundle.pyinstaller

View File

@@ -179,6 +179,16 @@
{
"action": "add",
"when": "6aaf96a3d6e7d0d426e97e11a2fcf52fda00e733",
"short": "[priority] Security: [[CVE-2024-10123](https://nvd.nist.gov/vuln/detail/CVE-2024-10123)] [Properly sanitize file-extension to prevent file system modification and RCE](https://github.com/yt-dlp/yt-dlp/security/advisories/GHSA-79w7-vh3h-8g4j)\n - Unsafe extensions are now blocked from being downloaded"
"short": "[priority] Security: [[CVE-2024-38519](https://nvd.nist.gov/vuln/detail/CVE-2024-38519)] [Properly sanitize file-extension to prevent file system modification and RCE](https://github.com/yt-dlp/yt-dlp/security/advisories/GHSA-79w7-vh3h-8g4j)\n - Unsafe extensions are now blocked from being downloaded"
},
{
"action": "add",
"when": "6075a029dba70a89675ae1250e7cdfd91f0eba41",
"short": "[priority] Security: [[ie/douyutv] Do not use dangerous javascript source/URL](https://github.com/yt-dlp/yt-dlp/security/advisories/GHSA-3v33-3wmw-3785)\n - A dependency on potentially malicious third-party JavaScript code has been removed from the Douyu extractors"
},
{
"action": "add",
"when": "fb8b7f226d251e521a89b23c415e249e5b788e5c",
"short": "[priority] **The minimum *recommended* Python version has been raised to 3.9**\nSince Python 3.8 will reach end-of-life in October 2024, support for it will be dropped soon. [Read more](https://github.com/yt-dlp/yt-dlp/issues/10086)"
}
]

View File

@@ -46,6 +46,14 @@ VERBOSE_TMPL = '''
render: shell
validations:
required: true
- type: markdown
attributes:
value: |
> [!CAUTION]
> ### GitHub is experiencing a high volume of malicious spam comments.
> ### If you receive any replies asking you download a file, do NOT follow the download links!
>
> Note that this issue may be temporarily locked as an anti-spam measure after it is opened.
'''.strip()
NO_SKIP = '''

View File

@@ -9,6 +9,7 @@ maintainers = [
{name = "Grub4K", email = "contact@grub4k.xyz"},
{name = "bashonly", email = "bashonly@protonmail.com"},
{name = "coletdjnz", email = "coletdjnz@protonmail.com"},
{name = "sepro", email = "sepro@sepr0.com"},
]
description = "A feature-rich command-line audio/video downloader"
readme = "README.md"
@@ -48,12 +49,15 @@ dependencies = [
"pycryptodomex",
"requests>=2.32.2,<3",
"urllib3>=1.26.17,<3",
"websockets>=12.0",
"websockets>=13.0",
]
[project.optional-dependencies]
default = []
curl-cffi = ["curl-cffi==0.5.10; implementation_name=='cpython'"]
curl-cffi = [
"curl-cffi==0.5.10; os_name=='nt' and implementation_name=='cpython'",
"curl-cffi>=0.5.10,!=0.6.*,<0.7.2; os_name!='nt' and implementation_name=='cpython'",
]
secretstorage = [
"cffi",
"secretstorage",
@@ -62,7 +66,7 @@ build = [
"build",
"hatchling",
"pip",
"setuptools",
"setuptools>=71.0.2", # 71.0.0 broke pyinstaller
"wheel",
]
dev = [
@@ -72,13 +76,13 @@ dev = [
]
static-analysis = [
"autopep8~=2.0",
"ruff~=0.5.0",
"ruff~=0.6.0",
]
test = [
"pytest~=8.1",
]
pyinstaller = [
"pyinstaller>=6.7.0", # for compat with setuptools>=70
"pyinstaller>=6.10.0", # Windows temp cleanup fixed in 6.10.0
]
py2exe = [
"py2exe>=0.12",

View File

@@ -143,6 +143,7 @@
- **BBVTV**: [*bbvtv*](## "netrc machine")
- **BBVTVLive**: [*bbvtv*](## "netrc machine")
- **BBVTVRecordings**: [*bbvtv*](## "netrc machine")
- **BeaconTv**
- **BeatBumpPlaylist**
- **BeatBumpVideo**
- **Beatport**
@@ -354,7 +355,6 @@
- **DigitallySpeaking**
- **Digiteka**
- **DiscogsReleasePlaylist**
- **Discovery**
- **DiscoveryLife**
- **DiscoveryNetworksDe**
- **DiscoveryPlus**
@@ -363,7 +363,6 @@
- **DiscoveryPlusItaly**
- **DiscoveryPlusItalyShow**
- **Disney**
- **DIYNetwork**
- **dlf**
- **dlf:corpus**: DLF Multi-feed Archives
- **dlive:stream**
@@ -507,6 +506,7 @@
- **gem.cbc.ca:playlist**
- **Genius**
- **GeniusLyrics**
- **Germanupa**: germanupa.de
- **GetCourseRu**: [*getcourseru*](## "netrc machine")
- **GetCourseRuPlayer**
- **Gettr**
@@ -516,7 +516,6 @@
- **GlattvisionTVLive**: [*glattvisiontv*](## "netrc machine")
- **GlattvisionTVRecordings**: [*glattvisiontv*](## "netrc machine")
- **Glide**: Glide mobile video messages (glide.me)
- **GlobalCyclingNetworkPlus**
- **GlobalPlayerAudio**
- **GlobalPlayerAudioEpisode**
- **GlobalPlayerLive**
@@ -583,6 +582,7 @@
- **HungamaAlbumPlaylist**
- **HungamaSong**
- **huya:live**: huya.com
- **huya:video**: 虎牙视频
- **Hypem**
- **Hytale**
- **Icareus**
@@ -658,10 +658,12 @@
- **Ketnet**
- **khanacademy**
- **khanacademy:unit**
- **Kick**
- **kick:clips**
- **kick:live**
- **kick:vod**
- **Kicker**
- **KickStarter**
- **KickVOD**
- **Kika**: KiKA.de
- **kinja:embed**
- **KinoPoisk**
- **Kommunetv**
@@ -693,6 +695,7 @@
- **Lcp**
- **LcpPlay**
- **Le**: 乐视网
- **LearningOnScreen**
- **Lecture2Go**: (**Currently broken**)
- **Lecturio**: [*lecturio*](## "netrc machine")
- **LecturioCourse**: [*lecturio*](## "netrc machine")
@@ -723,7 +726,6 @@
- **livestream:original**
- **Livestreamfails**
- **Lnk**
- **LnkGo**
- **loc**: Library of Congress
- **loom**
- **loom:folder**
@@ -757,7 +759,7 @@
- **Masters**
- **MatchTV**
- **MBN**: mbn.co.kr (매일방송)
- **MDR**: MDR.DE and KiKA
- **MDR**: MDR.DE
- **MedalTV**
- **media.ccc.de**
- **media.ccc.de:lists**
@@ -812,6 +814,7 @@
- **MNetTVLive**: [*mnettv*](## "netrc machine")
- **MNetTVRecordings**: [*mnettv*](## "netrc machine")
- **MochaVideo**
- **Mojevideo**: mojevideo.sk
- **Mojvideo**
- **Monstercat**
- **MonsterSirenHypergryphMusic**
@@ -820,8 +823,6 @@
- **MotherlessGroup**
- **MotherlessUploader**
- **Motorsport**: motorsport.com (**Currently broken**)
- **MotorTrend**
- **MotorTrendOnDemand**
- **MovieFap**
- **Moviepilot**
- **MoviewPlay**
@@ -839,7 +840,7 @@
- **MTVUutisetArticle**: (**Currently broken**)
- **MuenchenTV**: münchen.tv (**Currently broken**)
- **MujRozhlas**
- **Murrtube**: (**Currently broken**)
- **Murrtube**
- **MurrtubeUser**: Murrtube user profile (**Currently broken**)
- **MuseAI**
- **MuseScore**
@@ -1145,7 +1146,6 @@
- **QuantumTV**: [*quantumtv*](## "netrc machine")
- **QuantumTVLive**: [*quantumtv*](## "netrc machine")
- **QuantumTVRecordings**: [*quantumtv*](## "netrc machine")
- **Qub**
- **R7**: (**Currently broken**)
- **R7Article**: (**Currently broken**)
- **Radiko**
@@ -1289,12 +1289,14 @@
- **Screencast**
- **Screencastify**
- **ScreencastOMatic**
- **ScreenRec**
- **ScrippsNetworks**
- **scrippsnetworks:watch**
- **Scrolller**
- **SCTE**: [*scte*](## "netrc machine") (**Currently broken**)
- **SCTECourse**: [*scte*](## "netrc machine") (**Currently broken**)
- **sejm**
- **Sen**
- **SenalColombiaLive**: (**Currently broken**)
- **SenateGov**
- **SenateISVP**
@@ -1331,6 +1333,7 @@
- **SlidesLive**
- **Slutload**
- **Smotrim**
- **SnapchatSpotlight**
- **Snotr**
- **Sohu**
- **SohuV**
@@ -1522,9 +1525,9 @@
- **tv5unis**
- **tv5unis:video**
- **tv8.it**
- **TVA**
- **TVANouvelles**
- **TVANouvellesArticle**
- **tvaplus**: TVA+
- **TVC**
- **TVCArticle**
- **TVer**
@@ -1612,12 +1615,14 @@
- **videomore:season**
- **videomore:video**
- **VideoPress**
- **Vidflex**
- **Vidio**: [*vidio*](## "netrc machine")
- **VidioLive**: [*vidio*](## "netrc machine")
- **VidioPremier**: [*vidio*](## "netrc machine")
- **VidLii**
- **Vidly**
- **vids.io**
- **Vidyard**
- **viewlift**
- **viewlift:embed**
- **Viidea**
@@ -1665,6 +1670,8 @@
- **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza
- **VrtNU**: [*vrtnu*](## "netrc machine") VRT MAX
- **VTM**: (**Currently broken**)
- **VTV**
- **VTVGo**
- **VTXTV**: [*vtxtv*](## "netrc machine")
- **VTXTVLive**: [*vtxtv*](## "netrc machine")
- **VTXTVRecordings**: [*vtxtv*](## "netrc machine")
@@ -1737,7 +1744,7 @@
- **XiaoHongShu**: 小红书
- **ximalaya**: 喜马拉雅FM
- **ximalaya:album**: 喜马拉雅FM 专辑
- **xinpianchang**: xinpianchang.com (**Currently broken**)
- **Xinpianchang**: 新片场
- **XMinus**: (**Currently broken**)
- **XNXX**
- **Xstream**

View File

@@ -4,6 +4,7 @@
import os
import sys
import unittest
from unittest.mock import patch
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
@@ -235,6 +236,35 @@ class TestFormatSelection(unittest.TestCase):
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'vid-vcodec-dot')
def test_format_selection_by_vcodec_sort(self):
formats = [
{'format_id': 'av1-format', 'ext': 'mp4', 'vcodec': 'av1', 'acodec': 'none', 'url': TEST_URL},
{'format_id': 'vp9-hdr-format', 'ext': 'mp4', 'vcodec': 'vp09.02.50.10.01.09.18.09.00', 'acodec': 'none', 'url': TEST_URL},
{'format_id': 'vp9-sdr-format', 'ext': 'mp4', 'vcodec': 'vp09.00.50.08', 'acodec': 'none', 'url': TEST_URL},
{'format_id': 'h265-format', 'ext': 'mp4', 'vcodec': 'h265', 'acodec': 'none', 'url': TEST_URL},
]
info_dict = _make_result(formats)
ydl = YDL({'format': 'bestvideo', 'format_sort': ['vcodec:vp9.2']})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'vp9-hdr-format')
ydl = YDL({'format': 'bestvideo', 'format_sort': ['vcodec:vp9']})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'vp9-sdr-format')
ydl = YDL({'format': 'bestvideo', 'format_sort': ['+vcodec:vp9.2']})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'vp9-hdr-format')
ydl = YDL({'format': 'bestvideo', 'format_sort': ['+vcodec:vp9']})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'vp9-sdr-format')
def test_format_selection_string_ops(self):
formats = [
{'format_id': 'abc-cba', 'ext': 'mp4', 'url': TEST_URL},
@@ -520,7 +550,33 @@ class TestFormatSelection(unittest.TestCase):
ydl.process_ie_result(info_dict)
self.assertEqual(ydl.downloaded_info_dicts, [])
def test_default_format_spec(self):
@patch('yt_dlp.postprocessor.ffmpeg.FFmpegMergerPP.available', False)
def test_default_format_spec_without_ffmpeg(self):
ydl = YDL({})
self.assertEqual(ydl._default_format_spec({}), 'best/bestvideo+bestaudio')
ydl = YDL({'simulate': True})
self.assertEqual(ydl._default_format_spec({}), 'best/bestvideo+bestaudio')
ydl = YDL({})
self.assertEqual(ydl._default_format_spec({'is_live': True}), 'best/bestvideo+bestaudio')
ydl = YDL({'simulate': True})
self.assertEqual(ydl._default_format_spec({'is_live': True}), 'best/bestvideo+bestaudio')
ydl = YDL({'outtmpl': '-'})
self.assertEqual(ydl._default_format_spec({}), 'best/bestvideo+bestaudio')
ydl = YDL({})
self.assertEqual(ydl._default_format_spec({}), 'best/bestvideo+bestaudio')
self.assertEqual(ydl._default_format_spec({'is_live': True}), 'best/bestvideo+bestaudio')
@patch('yt_dlp.postprocessor.ffmpeg.FFmpegMergerPP.available', True)
@patch('yt_dlp.postprocessor.ffmpeg.FFmpegMergerPP.can_merge', lambda _: True)
def test_default_format_spec_with_ffmpeg(self):
ydl = YDL({})
self.assertEqual(ydl._default_format_spec({}), 'bestvideo*+bestaudio/best')
ydl = YDL({'simulate': True})
self.assertEqual(ydl._default_format_spec({}), 'bestvideo*+bestaudio/best')
@@ -528,13 +584,13 @@ class TestFormatSelection(unittest.TestCase):
self.assertEqual(ydl._default_format_spec({'is_live': True}), 'best/bestvideo+bestaudio')
ydl = YDL({'simulate': True})
self.assertEqual(ydl._default_format_spec({'is_live': True}), 'bestvideo*+bestaudio/best')
self.assertEqual(ydl._default_format_spec({'is_live': True}), 'best/bestvideo+bestaudio')
ydl = YDL({'outtmpl': '-'})
self.assertEqual(ydl._default_format_spec({}), 'best/bestvideo+bestaudio')
ydl = YDL({})
self.assertEqual(ydl._default_format_spec({}, download=False), 'bestvideo*+bestaudio/best')
self.assertEqual(ydl._default_format_spec({}), 'bestvideo*+bestaudio/best')
self.assertEqual(ydl._default_format_spec({'is_live': True}), 'best/bestvideo+bestaudio')

View File

@@ -376,6 +376,61 @@ class TestJSInterpreter(unittest.TestCase):
jsi = JSInterpreter('''function f(p,a,c,k,e,d){while(c--)if(k[c])p=p.replace(new RegExp('\\b'+c.toString(a)+'\\b','g'),k[c]);return p}''')
self.assertEqual(jsi.call_function('f', '''h 7=g("1j");7.7h({7g:[{33:"w://7f-7e-7d-7c.v.7b/7a/79/78/77/76.74?t=73&s=2s&e=72&f=2t&71=70.0.0.1&6z=6y&6x=6w"}],6v:"w://32.v.u/6u.31",16:"r%",15:"r%",6t:"6s",6r:"",6q:"l",6p:"l",6o:"6n",6m:\'6l\',6k:"6j",9:[{33:"/2u?b=6i&n=50&6h=w://32.v.u/6g.31",6f:"6e"}],1y:{6d:1,6c:\'#6b\',6a:\'#69\',68:"67",66:30,65:r,},"64":{63:"%62 2m%m%61%5z%5y%5x.u%5w%5v%5u.2y%22 2k%m%1o%22 5t%m%1o%22 5s%m%1o%22 2j%m%5r%22 16%m%5q%22 15%m%5p%22 5o%2z%5n%5m%2z",5l:"w://v.u/d/1k/5k.2y",5j:[]},\'5i\':{"5h":"5g"},5f:"5e",5d:"w://v.u",5c:{},5b:l,1x:[0.25,0.50,0.75,1,1.25,1.5,2]});h 1m,1n,5a;h 59=0,58=0;h 7=g("1j");h 2x=0,57=0,56=0;$.55({54:{\'53-52\':\'2i-51\'}});7.j(\'4z\',6(x){c(5>0&&x.1l>=5&&1n!=1){1n=1;$(\'q.4y\').4x(\'4w\')}});7.j(\'13\',6(x){2x=x.1l});7.j(\'2g\',6(x){2w(x)});7.j(\'4v\',6(){$(\'q.2v\').4u()});6 2w(x){$(\'q.2v\').4t();c(1m)19;1m=1;17=0;c(4s.4r===l){17=1}$.4q(\'/2u?b=4p&2l=1k&4o=2t-4n-4m-2s-4l&4k=&4j=&4i=&17=\'+17,6(2r){$(\'#4h\').4g(2r)});$(\'.3-8-4f-4e:4d("4c")\').2h(6(e){2q();g().4b(0);g().4a(l)});6 2q(){h $14=$("<q />").2p({1l:"49",16:"r%",15:"r%",48:0,2n:0,2o:47,46:"45(10%, 10%, 10%, 0.4)","44-43":"42"});$("<41 />").2p({16:"60%",15:"60%",2o:40,"3z-2n":"3y"}).3x({\'2m\':\'/?b=3w&2l=1k\',\'2k\':\'0\',\'2j\':\'2i\'}).2f($14);$14.2h(6(){$(3v).3u();g().2g()});$14.2f($(\'#1j\'))}g().13(0);}6 3t(){h 9=7.1b(2e);2d.2c(9);c(9.n>1){1r(i=0;i<9.n;i++){c(9[i].1a==2e){2d.2c(\'!!=\'+i);7.1p(i)}}}}7.j(\'3s\',6(){g().1h("/2a/3r.29","3q 10 28",6(){g().13(g().27()+10)},"2b");$("q[26=2b]").23().21(\'.3-20-1z\');g().1h("/2a/3p.29","3o 10 28",6(){h 12=g().27()-10;c(12<0)12=0;g().13(12)},"24");$("q[26=24]").23().21(\'.3-20-1z\');});6 1i(){}7.j(\'3n\',6(){1i()});7.j(\'3m\',6(){1i()});7.j("k",6(y){h 9=7.1b();c(9.n<2)19;$(\'.3-8-3l-3k\').3j(6(){$(\'#3-8-a-k\').1e(\'3-8-a-z\');$(\'.3-a-k\').p(\'o-1f\',\'11\')});7.1h("/3i/3h.3g","3f 3e",6(){$(\'.3-1w\').3d(\'3-8-1v\');$(\'.3-8-1y, .3-8-1x\').p(\'o-1g\',\'11\');c($(\'.3-1w\').3c(\'3-8-1v\')){$(\'.3-a-k\').p(\'o-1g\',\'l\');$(\'.3-a-k\').p(\'o-1f\',\'l\');$(\'.3-8-a\').1e(\'3-8-a-z\');$(\'.3-8-a:1u\').3b(\'3-8-a-z\')}3a{$(\'.3-a-k\').p(\'o-1g\',\'11\');$(\'.3-a-k\').p(\'o-1f\',\'11\');$(\'.3-8-a:1u\').1e(\'3-8-a-z\')}},"39");7.j("38",6(y){1d.37(\'1c\',y.9[y.36].1a)});c(1d.1t(\'1c\')){35("1s(1d.1t(\'1c\'));",34)}});h 18;6 1s(1q){h 9=7.1b();c(9.n>1){1r(i=0;i<9.n;i++){c(9[i].1a==1q){c(i==18){19}18=i;7.1p(i)}}}}',36,270,'|||jw|||function|player|settings|tracks|submenu||if||||jwplayer|var||on|audioTracks|true|3D|length|aria|attr|div|100|||sx|filemoon|https||event|active||false|tt|seek|dd|height|width|adb|current_audio|return|name|getAudioTracks|default_audio|localStorage|removeClass|expanded|checked|addButton|callMeMaybe|vplayer|0fxcyc2ajhp1|position|vvplay|vvad|220|setCurrentAudioTrack|audio_name|for|audio_set|getItem|last|open|controls|playbackRates|captions|rewind|icon|insertAfter||detach|ff00||button|getPosition|sec|png|player8|ff11|log|console|track_name|appendTo|play|click|no|scrolling|frameborder|file_code|src|top|zIndex|css|showCCform|data|1662367683|383371|dl|video_ad|doPlay|prevt|mp4|3E||jpg|thumbs|file|300|setTimeout|currentTrack|setItem|audioTrackChanged|dualSound|else|addClass|hasClass|toggleClass|Track|Audio|svg|dualy|images|mousedown|buttons|topbar|playAttemptFailed|beforePlay|Rewind|fr|Forward|ff|ready|set_audio_track|remove|this|upload_srt|prop|50px|margin|1000001|iframe|center|align|text|rgba|background|1000000|left|absolute|pause|setCurrentCaptions|Upload|contains|item|content|html|fviews|referer|prem|embed|3e57249ef633e0d03bf76ceb8d8a4b65|216|83|hash|view|get|TokenZir|window|hide|show|complete|slow|fadeIn|video_ad_fadein|time||cache|Cache|Content|headers|ajaxSetup|v2done|tott|vastdone2|vastdone1|vvbefore|playbackRateControls|cast|aboutlink|FileMoon|abouttext|UHD|1870|qualityLabels|sites|GNOME_POWER|link|2Fiframe|3C|allowfullscreen|22360|22640|22no|marginheight|marginwidth|2FGNOME_POWER|2F0fxcyc2ajhp1|2Fe|2Ffilemoon|2F|3A||22https|3Ciframe|code|sharing|fontOpacity|backgroundOpacity|Tahoma|fontFamily|303030|backgroundColor|FFFFFF|color|userFontScale|thumbnails|kind|0fxcyc2ajhp10000|url|get_slides|start|startparam|none|preload|html5|primary|hlshtml|androidhls|duration|uniform|stretching|0fxcyc2ajhp1_xt|image|2048|sp|6871|asn|127|srv|43200|_g3XlBcu2lmD9oDexD2NLWSmah2Nu3XcDrl93m9PwXY|m3u8||master|0fxcyc2ajhp1_x|00076|01|hls2|to|s01|delivery|storage|moon|sources|setup'''.split('|')))
def test_join(self):
test_input = list('test')
tests = [
'function f(a, b){return a.join(b)}',
'function f(a, b){return Array.prototype.join.call(a, b)}',
'function f(a, b){return Array.prototype.join.apply(a, [b])}',
]
for test in tests:
jsi = JSInterpreter(test)
self._test(jsi, 'test', args=[test_input, ''])
self._test(jsi, 't-e-s-t', args=[test_input, '-'])
self._test(jsi, '', args=[[], '-'])
def test_split(self):
test_result = list('test')
tests = [
'function f(a, b){return a.split(b)}',
'function f(a, b){return String.prototype.split.call(a, b)}',
'function f(a, b){return String.prototype.split.apply(a, [b])}',
]
for test in tests:
jsi = JSInterpreter(test)
self._test(jsi, test_result, args=['test', ''])
self._test(jsi, test_result, args=['t-e-s-t', '-'])
self._test(jsi, [''], args=['', '-'])
self._test(jsi, [], args=['', ''])
def test_slice(self):
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice()}', [0, 1, 2, 3, 4, 5, 6, 7, 8])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(0)}', [0, 1, 2, 3, 4, 5, 6, 7, 8])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(5)}', [5, 6, 7, 8])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(99)}', [])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(-2)}', [7, 8])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(-99)}', [0, 1, 2, 3, 4, 5, 6, 7, 8])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(0, 0)}', [])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(1, 0)}', [])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(0, 1)}', [0])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(3, 6)}', [3, 4, 5])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(1, -1)}', [1, 2, 3, 4, 5, 6, 7])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(-1, 1)}', [])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(-3, -1)}', [6, 7])
self._test('function f(){return "012345678".slice()}', '012345678')
self._test('function f(){return "012345678".slice(0)}', '012345678')
self._test('function f(){return "012345678".slice(5)}', '5678')
self._test('function f(){return "012345678".slice(99)}', '')
self._test('function f(){return "012345678".slice(-2)}', '78')
self._test('function f(){return "012345678".slice(-99)}', '012345678')
self._test('function f(){return "012345678".slice(0, 0)}', '')
self._test('function f(){return "012345678".slice(1, 0)}', '')
self._test('function f(){return "012345678".slice(0, 1)}', '0')
self._test('function f(){return "012345678".slice(3, 6)}', '345')
self._test('function f(){return "012345678".slice(1, -1)}', '1234567')
self._test('function f(){return "012345678".slice(-1, 1)}', '')
self._test('function f(){return "012345678".slice(-3, -1)}', '67')
if __name__ == '__main__':
unittest.main()

View File

@@ -265,6 +265,11 @@ class HTTPTestRequestHandler(http.server.BaseHTTPRequestHandler):
self.end_headers()
self.wfile.write(payload)
self.finish()
elif self.path == '/get_cookie':
self.send_response(200)
self.send_header('Set-Cookie', 'test=ytdlp; path=/')
self.end_headers()
self.finish()
else:
self._status(404)
@@ -338,6 +343,52 @@ class TestHTTPRequestHandler(TestRequestHandlerBase):
validate_and_send(rh, Request(f'https://127.0.0.1:{https_port}/headers'))
assert not issubclass(exc_info.type, CertificateVerifyError)
@pytest.mark.skip_handler('CurlCFFI', 'legacy_ssl ignored by CurlCFFI')
def test_legacy_ssl_extension(self, handler):
# HTTPS server with old ciphers
# XXX: is there a better way to test this than to create a new server?
https_httpd = http.server.ThreadingHTTPServer(
('127.0.0.1', 0), HTTPTestRequestHandler)
sslctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
sslctx.maximum_version = ssl.TLSVersion.TLSv1_2
sslctx.set_ciphers('SHA1:AESCCM:aDSS:eNULL:aNULL')
sslctx.load_cert_chain(os.path.join(TEST_DIR, 'testcert.pem'), None)
https_httpd.socket = sslctx.wrap_socket(https_httpd.socket, server_side=True)
https_port = http_server_port(https_httpd)
https_server_thread = threading.Thread(target=https_httpd.serve_forever)
https_server_thread.daemon = True
https_server_thread.start()
with handler(verify=False) as rh:
res = validate_and_send(rh, Request(f'https://127.0.0.1:{https_port}/headers', extensions={'legacy_ssl': True}))
assert res.status == 200
res.close()
# Ensure only applies to request extension
with pytest.raises(SSLError):
validate_and_send(rh, Request(f'https://127.0.0.1:{https_port}/headers'))
@pytest.mark.skip_handler('CurlCFFI', 'legacy_ssl ignored by CurlCFFI')
def test_legacy_ssl_support(self, handler):
# HTTPS server with old ciphers
# XXX: is there a better way to test this than to create a new server?
https_httpd = http.server.ThreadingHTTPServer(
('127.0.0.1', 0), HTTPTestRequestHandler)
sslctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
sslctx.maximum_version = ssl.TLSVersion.TLSv1_2
sslctx.set_ciphers('SHA1:AESCCM:aDSS:eNULL:aNULL')
sslctx.load_cert_chain(os.path.join(TEST_DIR, 'testcert.pem'), None)
https_httpd.socket = sslctx.wrap_socket(https_httpd.socket, server_side=True)
https_port = http_server_port(https_httpd)
https_server_thread = threading.Thread(target=https_httpd.serve_forever)
https_server_thread.daemon = True
https_server_thread.start()
with handler(verify=False, legacy_ssl_support=True) as rh:
res = validate_and_send(rh, Request(f'https://127.0.0.1:{https_port}/headers'))
assert res.status == 200
res.close()
def test_percent_encode(self, handler):
with handler() as rh:
# Unicode characters should be encoded with uppercase percent-encoding
@@ -490,6 +541,24 @@ class TestHTTPRequestHandler(TestRequestHandlerBase):
rh, Request(f'http://127.0.0.1:{self.http_port}/headers', extensions={'cookiejar': cookiejar})).read()
assert b'cookie: test=ytdlp' in data.lower()
def test_cookie_sync_only_cookiejar(self, handler):
# Ensure that cookies are ONLY being handled by the cookiejar
with handler() as rh:
validate_and_send(rh, Request(f'http://127.0.0.1:{self.http_port}/get_cookie', extensions={'cookiejar': YoutubeDLCookieJar()}))
data = validate_and_send(rh, Request(f'http://127.0.0.1:{self.http_port}/headers', extensions={'cookiejar': YoutubeDLCookieJar()})).read()
assert b'cookie: test=ytdlp' not in data.lower()
def test_cookie_sync_delete_cookie(self, handler):
# Ensure that cookies are ONLY being handled by the cookiejar
cookiejar = YoutubeDLCookieJar()
with handler(cookiejar=cookiejar) as rh:
validate_and_send(rh, Request(f'http://127.0.0.1:{self.http_port}/get_cookie'))
data = validate_and_send(rh, Request(f'http://127.0.0.1:{self.http_port}/headers')).read()
assert b'cookie: test=ytdlp' in data.lower()
cookiejar.clear_session_cookies()
data = validate_and_send(rh, Request(f'http://127.0.0.1:{self.http_port}/headers')).read()
assert b'cookie: test=ytdlp' not in data.lower()
def test_headers(self, handler):
with handler(headers=HTTPHeaderDict({'test1': 'test', 'test2': 'test2'})) as rh:
@@ -753,6 +822,24 @@ class TestRequestHandlerMisc:
rh.close()
assert len(logging_handlers) == before_count
def test_wrap_request_errors(self):
class TestRequestHandler(RequestHandler):
def _validate(self, request):
if request.headers.get('x-fail'):
raise UnsupportedRequest('test error')
def _send(self, request: Request):
raise RequestError('test error')
with TestRequestHandler(logger=FakeLogger()) as rh:
with pytest.raises(UnsupportedRequest, match='test error') as exc_info:
rh.validate(Request('http://example.com', headers={'x-fail': '1'}))
assert exc_info.value.handler is rh
with pytest.raises(RequestError, match='test error') as exc_info:
rh.send(Request('http://example.com'))
assert exc_info.value.handler is rh
@pytest.mark.parametrize('handler', ['Urllib'], indirect=True)
class TestUrllibRequestHandler(TestRequestHandlerBase):
@@ -914,7 +1001,6 @@ class TestRequestsRequestHandler(TestRequestHandlerBase):
class TestCurlCFFIRequestHandler(TestRequestHandlerBase):
@pytest.mark.parametrize('params,extensions', [
({}, {'impersonate': ImpersonateTarget('chrome')}),
({'impersonate': ImpersonateTarget('chrome', '110')}, {}),
({'impersonate': ImpersonateTarget('chrome', '99')}, {'impersonate': ImpersonateTarget('chrome', '110')}),
])
@@ -1200,6 +1286,9 @@ class TestRequestHandlerValidation:
({'timeout': 1}, False),
({'timeout': 'notatimeout'}, AssertionError),
({'unsupported': 'value'}, UnsupportedRequest),
({'legacy_ssl': False}, False),
({'legacy_ssl': True}, False),
({'legacy_ssl': 'notabool'}, AssertionError),
]),
('Requests', 'http', [
({'cookiejar': 'notacookiejar'}, AssertionError),
@@ -1207,6 +1296,9 @@ class TestRequestHandlerValidation:
({'timeout': 1}, False),
({'timeout': 'notatimeout'}, AssertionError),
({'unsupported': 'value'}, UnsupportedRequest),
({'legacy_ssl': False}, False),
({'legacy_ssl': True}, False),
({'legacy_ssl': 'notabool'}, AssertionError),
]),
('CurlCFFI', 'http', [
({'cookiejar': 'notacookiejar'}, AssertionError),
@@ -1220,6 +1312,9 @@ class TestRequestHandlerValidation:
({'impersonate': ImpersonateTarget(None, None, None, None)}, False),
({'impersonate': ImpersonateTarget()}, False),
({'impersonate': 'chrome'}, AssertionError),
({'legacy_ssl': False}, False),
({'legacy_ssl': True}, False),
({'legacy_ssl': 'notabool'}, AssertionError),
]),
(NoCheckRH, 'http', [
({'cookiejar': 'notacookiejar'}, False),
@@ -1228,6 +1323,9 @@ class TestRequestHandlerValidation:
('Websockets', 'ws', [
({'cookiejar': YoutubeDLCookieJar()}, False),
({'timeout': 2}, False),
({'legacy_ssl': False}, False),
({'legacy_ssl': True}, False),
({'legacy_ssl': 'notabool'}, AssertionError),
]),
]

View File

@@ -444,6 +444,8 @@ class TestUtil(unittest.TestCase):
self.assertEqual(unified_timestamp('Sep 11, 2013 | 5:49 AM'), 1378878540)
self.assertEqual(unified_timestamp('December 15, 2017 at 7:49 am'), 1513324140)
self.assertEqual(unified_timestamp('2018-03-14T08:32:43.1493874+00:00'), 1521016363)
self.assertEqual(unified_timestamp('Sunday, 26 Nov 2006, 19:00'), 1164567600)
self.assertEqual(unified_timestamp('wed, aug 16, 2008, 12:00pm'), 1218931200)
self.assertEqual(unified_timestamp('December 31 1969 20:00:01 EDT'), 1)
self.assertEqual(unified_timestamp('Wednesday 31 December 1969 18:01:26 MDT'), 86)
@@ -919,6 +921,11 @@ class TestUtil(unittest.TestCase):
'acodec': 'none',
'dynamic_range': 'HDR10',
})
self.assertEqual(parse_codecs('vp09.02.50.10.01.09.18.09.00'), {
'vcodec': 'vp09.02.50.10.01.09.18.09.00',
'acodec': 'none',
'dynamic_range': 'HDR10',
})
self.assertEqual(parse_codecs('av01.0.12M.10.0.110.09.16.09.0'), {
'vcodec': 'av01.0.12M.10.0.110.09.16.09.0',
'acodec': 'none',
@@ -929,6 +936,11 @@ class TestUtil(unittest.TestCase):
'acodec': 'none',
'dynamic_range': 'DV',
})
self.assertEqual(parse_codecs('fLaC'), {
'vcodec': 'none',
'acodec': 'flac',
'dynamic_range': None,
})
self.assertEqual(parse_codecs('theora, vorbis'), {
'vcodec': 'theora',
'acodec': 'vorbis',

View File

@@ -61,6 +61,10 @@ def process_request(self, request):
return websockets.http11.Response(
status.value, status.phrase, websockets.datastructures.Headers([('Location', '/')]), b'')
return self.protocol.reject(status.value, status.phrase)
elif request.path.startswith('/get_cookie'):
response = self.protocol.accept(request)
response.headers['Set-Cookie'] = 'test=ytdlp'
return response
return self.protocol.accept(request)
@@ -84,7 +88,7 @@ def create_wss_websocket_server():
certfn = os.path.join(TEST_DIR, 'testcert.pem')
sslctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
sslctx.load_cert_chain(certfn, None)
return create_websocket_server(ssl_context=sslctx)
return create_websocket_server(ssl=sslctx)
MTLS_CERT_DIR = os.path.join(TEST_DIR, 'testdata', 'certificate')
@@ -99,7 +103,16 @@ def create_mtls_wss_websocket_server():
sslctx.load_verify_locations(cafile=cacertfn)
sslctx.load_cert_chain(certfn, None)
return create_websocket_server(ssl_context=sslctx)
return create_websocket_server(ssl=sslctx)
def create_legacy_wss_websocket_server():
certfn = os.path.join(TEST_DIR, 'testcert.pem')
sslctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
sslctx.maximum_version = ssl.TLSVersion.TLSv1_2
sslctx.set_ciphers('SHA1:AESCCM:aDSS:eNULL:aNULL')
sslctx.load_cert_chain(certfn, None)
return create_websocket_server(ssl=sslctx)
def ws_validate_and_send(rh, req):
@@ -126,12 +139,15 @@ class TestWebsSocketRequestHandlerConformance:
cls.wss_thread, cls.wss_port = create_wss_websocket_server()
cls.wss_base_url = f'wss://127.0.0.1:{cls.wss_port}'
cls.bad_wss_thread, cls.bad_wss_port = create_websocket_server(ssl_context=ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER))
cls.bad_wss_thread, cls.bad_wss_port = create_websocket_server(ssl=ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER))
cls.bad_wss_host = f'wss://127.0.0.1:{cls.bad_wss_port}'
cls.mtls_wss_thread, cls.mtls_wss_port = create_mtls_wss_websocket_server()
cls.mtls_wss_base_url = f'wss://127.0.0.1:{cls.mtls_wss_port}'
cls.legacy_wss_thread, cls.legacy_wss_port = create_legacy_wss_websocket_server()
cls.legacy_wss_host = f'wss://127.0.0.1:{cls.legacy_wss_port}'
def test_basic_websockets(self, handler):
with handler() as rh:
ws = ws_validate_and_send(rh, Request(self.ws_base_url))
@@ -166,6 +182,22 @@ class TestWebsSocketRequestHandlerConformance:
ws_validate_and_send(rh, Request(self.bad_wss_host))
assert not issubclass(exc_info.type, CertificateVerifyError)
def test_legacy_ssl_extension(self, handler):
with handler(verify=False) as rh:
ws = ws_validate_and_send(rh, Request(self.legacy_wss_host, extensions={'legacy_ssl': True}))
assert ws.status == 101
ws.close()
# Ensure only applies to request extension
with pytest.raises(SSLError):
ws_validate_and_send(rh, Request(self.legacy_wss_host))
def test_legacy_ssl_support(self, handler):
with handler(verify=False, legacy_ssl_support=True) as rh:
ws = ws_validate_and_send(rh, Request(self.legacy_wss_host))
assert ws.status == 101
ws.close()
@pytest.mark.parametrize('path,expected', [
# Unicode characters should be encoded with uppercase percent-encoding
('/中文', '/%E4%B8%AD%E6%96%87'),
@@ -248,6 +280,32 @@ class TestWebsSocketRequestHandlerConformance:
assert json.loads(ws.recv())['cookie'] == 'test=ytdlp'
ws.close()
@pytest.mark.skip_handler('Websockets', 'Set-Cookie not supported by websockets')
def test_cookie_sync_only_cookiejar(self, handler):
# Ensure that cookies are ONLY being handled by the cookiejar
with handler() as rh:
ws_validate_and_send(rh, Request(f'{self.ws_base_url}/get_cookie', extensions={'cookiejar': YoutubeDLCookieJar()}))
ws = ws_validate_and_send(rh, Request(self.ws_base_url, extensions={'cookiejar': YoutubeDLCookieJar()}))
ws.send('headers')
assert 'cookie' not in json.loads(ws.recv())
ws.close()
@pytest.mark.skip_handler('Websockets', 'Set-Cookie not supported by websockets')
def test_cookie_sync_delete_cookie(self, handler):
# Ensure that cookies are ONLY being handled by the cookiejar
cookiejar = YoutubeDLCookieJar()
with handler(verbose=True, cookiejar=cookiejar) as rh:
ws_validate_and_send(rh, Request(f'{self.ws_base_url}/get_cookie'))
ws = ws_validate_and_send(rh, Request(self.ws_base_url))
ws.send('headers')
assert json.loads(ws.recv())['cookie'] == 'test=ytdlp'
ws.close()
cookiejar.clear_session_cookies()
ws = ws_validate_and_send(rh, Request(self.ws_base_url))
ws.send('headers')
assert 'cookie' not in json.loads(ws.recv())
ws.close()
def test_source_address(self, handler):
source_address = f'127.0.0.{random.randint(5, 255)}'
verify_address_availability(source_address)

View File

@@ -167,6 +167,22 @@ _NSIG_TESTS = [
'https://www.youtube.com/s/player/590f65a6/player_ias.vflset/en_US/base.js',
'1tm7-g_A9zsI8_Lay_', 'xI4Vem4Put_rOg',
),
(
'https://www.youtube.com/s/player/b22ef6e7/player_ias.vflset/en_US/base.js',
'b6HcntHGkvBLk_FRf', 'kNPW6A7FyP2l8A',
),
(
'https://www.youtube.com/s/player/3400486c/player_ias.vflset/en_US/base.js',
'lL46g3XifCKUZn1Xfw', 'z767lhet6V2Skl',
),
(
'https://www.youtube.com/s/player/20dfca59/player_ias.vflset/en_US/base.js',
'-fLCxedkAk4LUTK2', 'O8kfRq1y1eyHGw',
),
(
'https://www.youtube.com/s/player/b12cc44b/player_ias.vflset/en_US/base.js',
'keLa5R2U00sR9SQK', 'N1OGyujjEwMnLw',
),
]

View File

@@ -452,7 +452,8 @@ class YoutubeDL:
Can also just be a single color policy,
in which case it applies to all outputs.
Valid stream names are 'stdout' and 'stderr'.
Valid color policies are one of 'always', 'auto', 'no_color' or 'never'.
Valid color policies are one of 'always', 'auto',
'no_color', 'never', 'auto-tty' or 'no_color-tty'.
geo_bypass: Bypass geographic restriction via faking X-Forwarded-For
HTTP header
geo_bypass_country:
@@ -659,12 +660,15 @@ class YoutubeDL:
self.params['color'] = 'no_color'
term_allow_color = os.getenv('TERM', '').lower() != 'dumb'
no_color = bool(os.getenv('NO_COLOR'))
base_no_color = bool(os.getenv('NO_COLOR'))
def process_color_policy(stream):
stream_name = {sys.stdout: 'stdout', sys.stderr: 'stderr'}[stream]
policy = traverse_obj(self.params, ('color', (stream_name, None), {str}), get_all=False)
if policy in ('auto', None):
policy = traverse_obj(self.params, ('color', (stream_name, None), {str}, any)) or 'auto'
if policy in ('auto', 'auto-tty', 'no_color-tty'):
no_color = base_no_color
if policy.endswith('tty'):
no_color = policy.startswith('no_color')
if term_allow_color and supports_terminal_sequences(stream):
return 'no_color' if no_color else True
return False
@@ -2190,9 +2194,8 @@ class YoutubeDL:
or all(f.get('acodec') == 'none' for f in formats)), # OR, No formats with audio
}))
def _default_format_spec(self, info_dict, download=True):
download = download and not self.params.get('simulate')
prefer_best = download and (
def _default_format_spec(self, info_dict):
prefer_best = (
self.params['outtmpl']['default'] == '-'
or info_dict.get('is_live') and not self.params.get('live_from_start'))
@@ -2200,7 +2203,7 @@ class YoutubeDL:
merger = FFmpegMergerPP(self)
return merger.available and merger.can_merge()
if not prefer_best and download and not can_merge():
if not prefer_best and not can_merge():
prefer_best = True
formats = self._get_formats(info_dict)
evaluate_formats = lambda spec: self._select_formats(formats, self.build_format_selector(spec))
@@ -2959,7 +2962,7 @@ class YoutubeDL:
continue
if format_selector is None:
req_format = self._default_format_spec(info_dict, download=download)
req_format = self._default_format_spec(info_dict)
self.write_debug(f'Default format spec: {req_format}')
format_selector = self.build_format_selector(req_format)
@@ -3169,11 +3172,12 @@ class YoutubeDL:
if test:
verbose = self.params.get('verbose')
quiet = self.params.get('quiet') or not verbose
params = {
'test': True,
'quiet': self.params.get('quiet') or not verbose,
'quiet': quiet,
'verbose': verbose,
'noprogress': not verbose,
'noprogress': quiet,
'nopart': True,
'skip_unavailable_fragments': False,
'keep_fragments': False,

View File

@@ -235,6 +235,11 @@ def validate_options(opts):
validate_regex('format sorting', f, FormatSorter.regex)
# Postprocessor formats
if opts.convertsubtitles == 'none':
opts.convertsubtitles = None
if opts.convertthumbnails == 'none':
opts.convertthumbnails = None
validate_regex('merge output format', opts.merge_output_format,
r'({0})(/({0}))*'.format('|'.join(map(re.escape, FFmpegMergerPP.SUPPORTED_EXTS))))
validate_regex('audio format', opts.audioformat, FFmpegExtractAudioPP.FORMAT_RE)
@@ -468,7 +473,7 @@ def validate_options(opts):
default_downloader = ed.get_basename()
for policy in opts.color.values():
if policy not in ('always', 'auto', 'no_color', 'never'):
if policy not in ('always', 'auto', 'auto-tty', 'no_color', 'no_color-tty', 'never'):
raise ValueError(f'"{policy}" is not a valid color policy')
warnings, deprecation_warnings = [], []
@@ -599,7 +604,7 @@ def validate_options(opts):
warnings.append(
'Using allow-unsafe-ext opens you up to potential attacks. '
'Use with great care!')
_UnsafeExtensionError.sanitize_extension = lambda x: x
_UnsafeExtensionError.sanitize_extension = lambda x, prepend=False: x
return warnings, deprecation_warnings

View File

@@ -1053,8 +1053,9 @@ def _decrypt_windows_dpapi(ciphertext, logger):
ctypes.byref(blob_out), # pDataOut
)
if not ret:
logger.warning('failed to decrypt with DPAPI', only_once=True)
return None
message = 'Failed to decrypt with DPAPI. See https://github.com/yt-dlp/yt-dlp/issues/10927 for more info'
logger.error(message)
raise DownloadError(message) # force exit
result = ctypes.string_at(blob_out.pbData, blob_out.cbData)
ctypes.windll.kernel32.LocalFree(blob_out.pbData)

View File

@@ -508,7 +508,7 @@ class FFmpegFD(ExternalFD):
env = None
proxy = self.params.get('proxy')
if proxy:
if not re.match(r'^[\da-zA-Z]+://', proxy):
if not re.match(r'[\da-zA-Z]+://', proxy):
proxy = f'http://{proxy}'
if proxy.startswith('socks'):
@@ -559,7 +559,7 @@ class FFmpegFD(ExternalFD):
selected_formats = info_dict.get('requested_formats') or [info_dict]
for i, fmt in enumerate(selected_formats):
is_http = re.match(r'^https?://', fmt['url'])
is_http = re.match(r'https?://', fmt['url'])
cookies = self.ydl.cookiejar.get_cookies_for_url(fmt['url']) if is_http else []
if cookies:
args.extend(['-cookies', ''.join(

View File

@@ -217,6 +217,7 @@ from .bbc import (
BBCCoUkIPlayerGroupIE,
BBCCoUkPlaylistIE,
)
from .beacon import BeaconTvIE
from .beatbump import (
BeatBumpPlaylistIE,
BeatBumpVideoIE,
@@ -504,7 +505,6 @@ from .dhm import DHMIE
from .digitalconcerthall import DigitalConcertHallIE
from .digiteka import DigitekaIE
from .discogs import DiscogsReleasePlaylistIE
from .discovery import DiscoveryIE
from .disney import DisneyIE
from .dispeak import DigitallySpeakingIE
from .dlf import (
@@ -532,16 +532,12 @@ from .dplay import (
DiscoveryPlusIndiaShowIE,
DiscoveryPlusItalyIE,
DiscoveryPlusItalyShowIE,
DIYNetworkIE,
DPlayIE,
FoodNetworkIE,
GlobalCyclingNetworkPlusIE,
GoDiscoveryIE,
HGTVDeIE,
HGTVUsaIE,
InvestigationDiscoveryIE,
MotorTrendIE,
MotorTrendOnDemandIE,
ScienceChannelIE,
TravelChannelIE,
)
@@ -734,6 +730,7 @@ from .genius import (
GeniusIE,
GeniusLyricsIE,
)
from .germanupa import GermanupaIE
from .getcourseru import (
GetCourseRuIE,
GetCourseRuPlayerIE,
@@ -827,7 +824,10 @@ from .hungama import (
HungamaIE,
HungamaSongIE,
)
from .huya import HuyaLiveIE
from .huya import (
HuyaLiveIE,
HuyaVideoIE,
)
from .hypem import HypemIE
from .hypergryph import MonsterSirenHypergryphMusicIE
from .hytale import HytaleIE
@@ -944,11 +944,13 @@ from .khanacademy import (
KhanAcademyUnitIE,
)
from .kick import (
KickClipIE,
KickIE,
KickVODIE,
)
from .kicker import KickerIE
from .kickstarter import KickStarterIE
from .kika import KikaIE
from .kinja import KinjaEmbedIE
from .kinopoisk import KinoPoiskIE
from .kommunetv import KommunetvIE
@@ -991,6 +993,7 @@ from .lcp import (
LcpIE,
LcpPlayIE,
)
from .learningonscreen import LearningOnScreenIE
from .lecture2go import Lecture2GoIE
from .lecturio import (
LecturioCourseIE,
@@ -1039,10 +1042,7 @@ from .livestream import (
LivestreamShortenerIE,
)
from .livestreamfails import LivestreamfailsIE
from .lnkgo import (
LnkGoIE,
LnkIE,
)
from .lnk import LnkIE
from .loom import (
LoomFolderIE,
LoomIE,
@@ -1167,6 +1167,7 @@ from .mlb import (
)
from .mlssoccer import MLSSoccerIE
from .mocha import MochaVideoIE
from .mojevideo import MojevideoIE
from .mojvideo import MojvideoIE
from .monstercat import MonstercatIE
from .motherless import (
@@ -1813,6 +1814,7 @@ from .screen9 import Screen9IE
from .screencast import ScreencastIE
from .screencastify import ScreencastifyIE
from .screencastomatic import ScreencastOMaticIE
from .screenrec import ScreenRecIE
from .scrippsnetworks import (
ScrippsNetworksIE,
ScrippsNetworksWatchIE,
@@ -1823,6 +1825,7 @@ from .scte import (
SCTECourseIE,
)
from .sejmpl import SejmIE
from .sen import SenIE
from .senalcolombia import SenalColombiaLiveIE
from .senategov import (
SenateGovIE,
@@ -1878,6 +1881,7 @@ from .slideshare import SlideshareIE
from .slideslive import SlidesLiveIE
from .slutload import SlutloadIE
from .smotrim import SmotrimIE
from .snapchat import SnapchatSpotlightIE
from .snotr import SnotrIE
from .sohu import (
SohuIE,
@@ -2174,10 +2178,7 @@ from .tv5unis import (
TV5UnisVideoIE,
)
from .tv24ua import TV24UAVideoIE
from .tva import (
TVAIE,
QubIE,
)
from .tva import TVAIE
from .tvanouvelles import (
TVANouvellesArticleIE,
TVANouvellesIE,
@@ -2317,6 +2318,7 @@ from .videomore import (
VideomoreVideoIE,
)
from .videopress import VideoPressIE
from .vidflex import VidflexIE
from .vidio import (
VidioIE,
VidioLiveIE,
@@ -2324,6 +2326,7 @@ from .vidio import (
)
from .vidlii import VidLiiIE
from .vidly import VidlyIE
from .vidyard import VidyardIE
from .viewlift import (
ViewLiftEmbedIE,
ViewLiftIE,
@@ -2389,6 +2392,10 @@ from .vrt import (
VrtNUIE,
)
from .vtm import VTMIE
from .vtv import (
VTVIE,
VTVGoIE,
)
from .vuclip import VuClipIE
from .vvvvid import (
VVVVIDIE,

View File

@@ -387,17 +387,27 @@ class ABCIViewShowSeriesIE(InfoExtractor):
'thumbnail': r're:^https?://cdn\.iview\.abc\.net\.au/thumbs/.*\.jpg$',
},
'playlist_count': 15,
'skip': 'This program is not currently available in ABC iview',
}, {
'url': 'https://iview.abc.net.au/show/inbestigators',
'info_dict': {
'id': '175343-1',
'title': 'Series 1',
'description': 'md5:b9976935a6450e5b78ce2a940a755685',
'series': 'The Inbestigators',
'season': 'Series 1',
'thumbnail': r're:^https?://cdn\.iview\.abc\.net\.au/thumbs/.+\.jpg',
},
'playlist_count': 17,
}]
def _real_extract(self, url):
show_id = self._match_id(url)
webpage = self._download_webpage(url, show_id)
webpage_data = self._search_regex(
r'window\.__INITIAL_STATE__\s*=\s*[\'"](.+?)[\'"]\s*;',
webpage, 'initial state')
video_data = self._parse_json(
unescapeHTML(webpage_data).encode().decode('unicode_escape'), show_id)
video_data = video_data['route']['pageData']['_embedded']
video_data = self._search_json(
r'window\.__INITIAL_STATE__\s*=\s*[\'"]', webpage, 'initial state', show_id,
transform_source=lambda x: x.encode().decode('unicode_escape'),
end_pattern=r'[\'"]\s*;')['route']['pageData']['_embedded']
highlight = try_get(video_data, lambda x: x['highlightVideo']['shareUrl'])
if not self._yes_playlist(show_id, bool(highlight), video_label='highlight video'):

View File

@@ -9,12 +9,12 @@ import re
import struct
import time
import urllib.parse
import urllib.request
import urllib.response
import uuid
from .common import InfoExtractor
from ..aes import aes_ecb_decrypt
from ..networking import RequestHandler, Response
from ..networking.exceptions import TransportError
from ..utils import (
ExtractorError,
OnDemandPagedList,
@@ -26,37 +26,36 @@ from ..utils import (
traverse_obj,
update_url_query,
)
from ..utils.networking import clean_proxies
def add_opener(ydl, handler): # FIXME: Create proper API in .networking
"""Add a handler for opening URLs, like _download_webpage"""
# https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L426
# https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L605
rh = ydl._request_director.handlers['Urllib']
if 'abematv-license' in rh._SUPPORTED_URL_SCHEMES:
return
headers = ydl.params['http_headers'].copy()
proxies = ydl.proxies.copy()
clean_proxies(proxies, headers)
opener = rh._get_instance(cookiejar=ydl.cookiejar, proxies=proxies)
assert isinstance(opener, urllib.request.OpenerDirector)
opener.add_handler(handler)
rh._SUPPORTED_URL_SCHEMES = (*rh._SUPPORTED_URL_SCHEMES, 'abematv-license')
class AbemaLicenseRH(RequestHandler):
_SUPPORTED_URL_SCHEMES = ('abematv-license',)
_SUPPORTED_PROXY_SCHEMES = None
_SUPPORTED_FEATURES = None
RH_NAME = 'abematv_license'
_STRTABLE = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'
_HKEY = b'3AF0298C219469522A313570E8583005A642E73EDD58E3EA2FB7339D3DF1597E'
class AbemaLicenseHandler(urllib.request.BaseHandler):
handler_order = 499
STRTABLE = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'
HKEY = b'3AF0298C219469522A313570E8583005A642E73EDD58E3EA2FB7339D3DF1597E'
def __init__(self, ie: 'AbemaTVIE'):
# the protocol that this should really handle is 'abematv-license://'
# abematv_license_open is just a placeholder for development purposes
# ref. https://github.com/python/cpython/blob/f4c03484da59049eb62a9bf7777b963e2267d187/Lib/urllib/request.py#L510
setattr(self, 'abematv-license_open', getattr(self, 'abematv_license_open', None))
def __init__(self, *, ie: 'AbemaTVIE', **kwargs):
super().__init__(**kwargs)
self.ie = ie
def _send(self, request):
url = request.url
ticket = urllib.parse.urlparse(url).netloc
try:
response_data = self._get_videokey_from_ticket(ticket)
except ExtractorError as e:
raise TransportError(cause=e.cause) from e
except (IndexError, KeyError, TypeError) as e:
raise TransportError(cause=repr(e)) from e
return Response(
io.BytesIO(response_data), url,
headers={'Content-Length': str(len(response_data))})
def _get_videokey_from_ticket(self, ticket):
to_show = self.ie.get_param('verbose', False)
media_token = self.ie._get_media_token(to_show=to_show)
@@ -72,25 +71,17 @@ class AbemaLicenseHandler(urllib.request.BaseHandler):
'Content-Type': 'application/json',
})
res = decode_base_n(license_response['k'], table=self.STRTABLE)
res = decode_base_n(license_response['k'], table=self._STRTABLE)
encvideokey = bytes_to_intlist(struct.pack('>QQ', res >> 64, res & 0xffffffffffffffff))
h = hmac.new(
binascii.unhexlify(self.HKEY),
binascii.unhexlify(self._HKEY),
(license_response['cid'] + self.ie._DEVICE_ID).encode(),
digestmod=hashlib.sha256)
enckey = bytes_to_intlist(h.digest())
return intlist_to_bytes(aes_ecb_decrypt(encvideokey, enckey))
def abematv_license_open(self, url):
url = url.get_full_url() if isinstance(url, urllib.request.Request) else url
ticket = urllib.parse.urlparse(url).netloc
response_data = self._get_videokey_from_ticket(ticket)
return urllib.response.addinfourl(io.BytesIO(response_data), headers={
'Content-Length': str(len(response_data)),
}, url=url, code=200)
class AbemaTVBaseIE(InfoExtractor):
_NETRC_MACHINE = 'abematv'
@@ -139,7 +130,7 @@ class AbemaTVBaseIE(InfoExtractor):
if self._USERTOKEN:
return self._USERTOKEN
add_opener(self._downloader, AbemaLicenseHandler(self))
self._downloader._request_director.add_handler(AbemaLicenseRH(ie=self, logger=None))
username, _ = self._get_login_info()
auth_cache = username and self.cache.load(self._NETRC_MACHINE, username, min_ver='2024.01.19')
@@ -368,6 +359,7 @@ class AbemaTVIE(AbemaTVBaseIE):
info['episode_number'] = epis if epis < 2000 else None
is_live, m3u8_url = False, None
availability = 'public'
if video_type == 'now-on-air':
is_live = True
channel_url = 'https://api.abema.io/v1/channels'
@@ -385,10 +377,10 @@ class AbemaTVIE(AbemaTVBaseIE):
f'https://api.abema.io/v1/video/programs/{video_id}', video_id,
note='Checking playability',
headers=headers)
ondemand_types = traverse_obj(api_response, ('terms', ..., 'onDemandType'))
if 3 not in ondemand_types:
if not traverse_obj(api_response, ('label', 'free', {bool})):
# cannot acquire decryption key for these streams
self.report_warning('This is a premium-only stream')
availability = 'premium_only'
info.update(traverse_obj(api_response, {
'series': ('series', 'title'),
'season': ('season', 'name'),
@@ -408,6 +400,7 @@ class AbemaTVIE(AbemaTVBaseIE):
headers=headers)
if not traverse_obj(api_response, ('slot', 'flags', 'timeshiftFree'), default=False):
self.report_warning('This is a premium-only stream')
availability = 'premium_only'
m3u8_url = f'https://vod-abematv.akamaized.net/slot/{video_id}/playlist.m3u8'
else:
@@ -425,6 +418,7 @@ class AbemaTVIE(AbemaTVBaseIE):
'description': description,
'formats': formats,
'is_live': is_live,
'availability': availability,
})
return info

View File

@@ -4,7 +4,7 @@ from .common import InfoExtractor
class AcademicEarthCourseIE(InfoExtractor):
_VALID_URL = r'^https?://(?:www\.)?academicearth\.org/playlists/(?P<id>[^?#/]+)'
_VALID_URL = r'https?://(?:www\.)?academicearth\.org/playlists/(?P<id>[^?#/]+)'
IE_NAME = 'AcademicEarth:Course'
_TEST = {
'url': 'http://academicearth.org/playlists/laws-of-nature/',

View File

@@ -16,6 +16,7 @@ from ..utils import (
float_or_none,
int_or_none,
intlist_to_bytes,
join_nonempty,
long_to_bytes,
parse_iso8601,
pkcs1pad,
@@ -48,9 +49,9 @@ class ADNBaseIE(InfoExtractor):
class ADNIE(ADNBaseIE):
_VALID_URL = r'https?://(?:www\.)?(?:animation|anime)digitalnetwork\.(?P<lang>fr|de)/video/[^/?#]+/(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?animationdigitalnetwork\.com/(?:(?P<lang>de)/)?video/[^/?#]+/(?P<id>\d+)'
_TESTS = [{
'url': 'https://animationdigitalnetwork.fr/video/fruits-basket/9841-episode-1-a-ce-soir',
'url': 'https://animationdigitalnetwork.com/video/558-fruits-basket/9841-episode-1-a-ce-soir',
'md5': '1c9ef066ceb302c86f80c2b371615261',
'info_dict': {
'id': '9841',
@@ -70,10 +71,7 @@ class ADNIE(ADNBaseIE):
},
'skip': 'Only available in French and German speaking Europe',
}, {
'url': 'http://animedigitalnetwork.fr/video/blue-exorcist-kyoto-saga/7778-episode-1-debut-des-hostilites',
'only_matching': True,
}, {
'url': 'https://animationdigitalnetwork.de/video/the-eminence-in-shadow/23550-folge-1',
'url': 'https://animationdigitalnetwork.com/de/video/973-the-eminence-in-shadow/23550-folge-1',
'md5': '5c5651bf5791fa6fcd7906012b9d94e8',
'info_dict': {
'id': '23550',
@@ -166,7 +164,7 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
'username': username,
})) or {}).get('accessToken')
if access_token:
self._HEADERS = {'authorization': 'Bearer ' + access_token}
self._HEADERS['Authorization'] = f'Bearer {access_token}'
except ExtractorError as e:
message = None
if isinstance(e.cause, HTTPError) and e.cause.status == 401:
@@ -177,6 +175,7 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
def _real_extract(self, url):
lang, video_id = self._match_valid_url(url).group('lang', 'id')
self._HEADERS['X-Target-Distribution'] = lang or 'fr'
video_base_url = self._PLAYER_BASE_URL + f'video/{video_id}/'
player = self._download_json(
video_base_url + 'configuration', video_id,
@@ -217,7 +216,6 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
links_data = self._download_json(
links_url, video_id, 'Downloading links JSON metadata', headers={
'X-Player-Token': authorization,
'X-Target-Distribution': lang,
**self._HEADERS,
}, query={
'freeWithAds': 'true',
@@ -256,6 +254,7 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
load_balancer_data = self._download_json(
load_balancer_url, video_id,
f'Downloading {format_id} {quality} JSON metadata',
headers=self._HEADERS,
fatal=False) or {}
m3u8_url = load_balancer_data.get('location')
if not m3u8_url:
@@ -276,7 +275,7 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
video = (self._download_json(
self._API_BASE_URL + f'video/{video_id}', video_id,
'Downloading additional video metadata', fatal=False) or {}).get('video') or {}
'Downloading additional video metadata', fatal=False, headers=self._HEADERS) or {}).get('video') or {}
show = video.get('show') or {}
return {
@@ -298,9 +297,9 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
class ADNSeasonIE(ADNBaseIE):
_VALID_URL = r'https?://(?:www\.)?(?:animation|anime)digitalnetwork\.(?P<lang>fr|de)/video/(?P<id>[^/?#]+)/?(?:$|[#?])'
_VALID_URL = r'https?://(?:www\.)?animationdigitalnetwork\.com/(?:(?P<lang>de)/)?video/(?P<id>\d+)[^/?#]*/?(?:$|[#?])'
_TESTS = [{
'url': 'https://animationdigitalnetwork.fr/video/tokyo-mew-mew-new',
'url': 'https://animationdigitalnetwork.com/video/911-tokyo-mew-mew-new',
'playlist_count': 12,
'info_dict': {
'id': '911',
@@ -311,24 +310,22 @@ class ADNSeasonIE(ADNBaseIE):
def _real_extract(self, url):
lang, video_show_slug = self._match_valid_url(url).group('lang', 'id')
self._HEADERS['X-Target-Distribution'] = lang or 'fr'
show = self._download_json(
f'{self._API_BASE_URL}show/{video_show_slug}/', video_show_slug,
'Downloading show JSON metadata', headers=self._HEADERS)['show']
show_id = str(show['id'])
episodes = self._download_json(
f'{self._API_BASE_URL}video/show/{show_id}', video_show_slug,
'Downloading episode list', headers={
'X-Target-Distribution': lang,
**self._HEADERS,
}, query={
'Downloading episode list', headers=self._HEADERS, query={
'order': 'asc',
'limit': '-1',
})
def entries():
for episode_id in traverse_obj(episodes, ('videos', ..., 'id', {str_or_none})):
yield self.url_result(
f'https://animationdigitalnetwork.{lang}/video/{video_show_slug}/{episode_id}',
ADNIE, episode_id)
yield self.url_result(join_nonempty(
'https://animationdigitalnetwork.com', lang, 'video',
video_show_slug, episode_id, delim='/'), ADNIE, episode_id)
return self.playlist_result(entries(), show_id, show.get('title'))

View File

@@ -1,6 +1,7 @@
import functools
from .common import InfoExtractor
from ..networking import Request
from ..utils import (
ExtractorError,
OnDemandPagedList,
@@ -58,6 +59,13 @@ class AfreecaTVBaseIE(InfoExtractor):
f'Unable to login: {self.IE_NAME} said: {error}',
expected=True)
def _call_api(self, endpoint, display_id, data=None, headers=None, query=None):
return self._download_json(Request(
f'https://api.m.afreecatv.com/{endpoint}',
data=data, headers=headers, query=query,
extensions={'legacy_ssl': True}), display_id,
'Downloading API JSON', 'Unable to download API JSON')
class AfreecaTVIE(AfreecaTVBaseIE):
IE_NAME = 'afreecatv'
@@ -184,12 +192,12 @@ class AfreecaTVIE(AfreecaTVBaseIE):
def _real_extract(self, url):
video_id = self._match_id(url)
data = self._download_json(
'https://api.m.afreecatv.com/station/video/a/view', video_id,
headers={'Referer': url}, data=urlencode_postdata({
data = self._call_api(
'station/video/a/view', video_id, headers={'Referer': url},
data=urlencode_postdata({
'nTitleNo': video_id,
'nApiLevel': 10,
}), impersonate=True)['data']
}))['data']
error_code = traverse_obj(data, ('code', {int}))
if error_code == -6221:
@@ -267,9 +275,9 @@ class AfreecaTVCatchStoryIE(AfreecaTVBaseIE):
def _real_extract(self, url):
video_id = self._match_id(url)
data = self._download_json(
'https://api.m.afreecatv.com/catchstory/a/view', video_id, headers={'Referer': url},
query={'aStoryListIdx': '', 'nStoryIdx': video_id}, impersonate=True)
data = self._call_api(
'catchstory/a/view', video_id, headers={'Referer': url},
query={'aStoryListIdx': '', 'nStoryIdx': video_id})
return self.playlist_result(self._entries(data), video_id)

View File

@@ -231,7 +231,7 @@ class ARDIE(InfoExtractor):
class ARDBetaMediathekIE(InfoExtractor):
IE_NAME = 'ARDMediathek'
_VALID_URL = r'''(?x)https://
_VALID_URL = r'''(?x)https?://
(?:(?:beta|www)\.)?ardmediathek\.de/
(?:[^/]+/)?
(?:player|live|video)/
@@ -470,7 +470,7 @@ class ARDBetaMediathekIE(InfoExtractor):
class ARDMediathekCollectionIE(InfoExtractor):
_VALID_URL = r'''(?x)https://
_VALID_URL = r'''(?x)https?://
(?:(?:beta|www)\.)?ardmediathek\.de/
(?:[^/?#]+/)?
(?P<playlist>sendung|serie|sammlung)/

View File

@@ -101,9 +101,10 @@ class AsobiStageIE(InfoExtractor):
self._HEADERS['Authorization'] = f'Bearer {token}'
def _real_extract(self, url):
video_id, event, type_, slug = self._match_valid_url(url).group('id', 'event', 'type', 'slug')
webpage, urlh = self._download_webpage_handle(url, self._match_id(url))
video_id, event, type_, slug = self._match_valid_url(urlh.url).group('id', 'event', 'type', 'slug')
video_type = {'archive': 'archives', 'player': 'broadcasts'}[type_]
webpage = self._download_webpage(url, video_id)
event_data = traverse_obj(
self._search_nextjs_data(webpage, video_id, default={}),
('props', 'pageProps', 'eventCMSData', {

View File

@@ -4,9 +4,13 @@ import urllib.parse
from .common import InfoExtractor
from ..utils import (
InAdvancePagedList,
determine_ext,
format_field,
int_or_none,
join_nonempty,
traverse_obj,
unified_timestamp,
url_or_none,
)
@@ -30,6 +34,7 @@ class BanByeBaseIE(InfoExtractor):
class BanByeIE(BanByeBaseIE):
_VALID_URL = r'https?://(?:www\.)?banbye\.com/(?:en/)?watch/(?P<id>[\w-]+)'
_TESTS = [{
# ['src']['mp4']['levels'] direct mp4 urls only
'url': 'https://banbye.com/watch/v_ytfmvkVYLE8T',
'md5': '2f4ea15c5ca259a73d909b2cfd558eb5',
'info_dict': {
@@ -58,6 +63,7 @@ class BanByeIE(BanByeBaseIE):
},
'playlist_mincount': 9,
}, {
# ['src']['mp4']['levels'] direct mp4 urls only
'url': 'https://banbye.com/watch/v_kb6_o1Kyq-CD',
'info_dict': {
'id': 'v_kb6_o1Kyq-CD',
@@ -77,6 +83,48 @@ class BanByeIE(BanByeBaseIE):
'view_count': int,
'comment_count': int,
},
}, {
# ['src']['hls']['levels'] variant m3u8 urls only; master m3u8 is 404
'url': 'https://banbye.com/watch/v_a_gPFuC9LoW5',
'info_dict': {
'id': 'v_a_gPFuC9LoW5',
'ext': 'mp4',
'title': 'md5:183524056bebdfa245fd6d214f63c0fe',
'description': 'md5:943ac87287ca98d28d8b8797719827c6',
'uploader': 'wRealu24',
'channel_id': 'ch_wrealu24',
'channel_url': 'https://banbye.com/channel/ch_wrealu24',
'upload_date': '20231113',
'timestamp': 1699874062,
'view_count': int,
'like_count': int,
'dislike_count': int,
'comment_count': int,
'thumbnail': 'https://cdn.banbye.com/video/v_a_gPFuC9LoW5/96.webp',
'tags': ['jaszczur', 'sejm', 'lewica', 'polska', 'ukrainizacja', 'pierwszeposiedzeniesejmu'],
},
'expected_warnings': ['Failed to download m3u8'],
}, {
# ['src']['hls']['masterPlaylist'] m3u8 only
'url': 'https://banbye.com/watch/v_B0rsKWsr-aaa',
'info_dict': {
'id': 'v_B0rsKWsr-aaa',
'ext': 'mp4',
'title': 'md5:00b254164b82101b3f9e5326037447ed',
'description': 'md5:3fd8b48aa81954ba024bc60f5de6e167',
'uploader': 'PSTV Piotr Szlachtowicz ',
'channel_id': 'ch_KV9EVObkB9wB',
'channel_url': 'https://banbye.com/channel/ch_KV9EVObkB9wB',
'upload_date': '20240629',
'timestamp': 1719646816,
'duration': 2377,
'view_count': int,
'like_count': int,
'dislike_count': int,
'comment_count': int,
'thumbnail': 'https://cdn.banbye.com/video/v_B0rsKWsr-aaa/96.webp',
'tags': ['Biden', 'Trump', 'Wybory', 'USA'],
},
}]
def _real_extract(self, url):
@@ -91,11 +139,24 @@ class BanByeIE(BanByeBaseIE):
'id': f'{quality}p',
'url': f'{self._CDN_BASE}/video/{video_id}/{quality}.webp',
} for quality in [48, 96, 144, 240, 512, 1080]]
formats = [{
'format_id': f'http-{quality}p',
'quality': quality,
'url': f'{self._CDN_BASE}/video/{video_id}/{quality}.mp4',
} for quality in data['quality']]
formats = []
url_data = self._download_json(f'{self._API_BASE}/videos/{video_id}/url', video_id, data=b'')
if master_url := traverse_obj(url_data, ('src', 'hls', 'masterPlaylist', {url_or_none})):
formats = self._extract_m3u8_formats(master_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
for format_id, format_url in traverse_obj(url_data, (
'src', ('mp4', 'hls'), 'levels', {dict.items}, lambda _, v: url_or_none(v[1]))):
ext = determine_ext(format_url)
is_hls = ext == 'm3u8'
formats.append({
'url': format_url,
'ext': 'mp4' if is_hls else ext,
'format_id': join_nonempty(is_hls and 'hls', format_id),
'protocol': 'm3u8_native' if is_hls else 'https',
'height': int_or_none(format_id),
})
self._remove_duplicate_formats(formats)
return {
'id': video_id,

View File

@@ -1,3 +1,5 @@
import functools
import json
import random
import re
import time
@@ -6,7 +8,9 @@ from .common import InfoExtractor
from ..utils import (
KNOWN_EXTENSIONS,
ExtractorError,
extract_attributes,
float_or_none,
get_element_html_by_id,
int_or_none,
parse_filesize,
str_or_none,
@@ -17,6 +21,7 @@ from ..utils import (
url_or_none,
urljoin,
)
from ..utils.traversal import traverse_obj
class BandcampIE(InfoExtractor):
@@ -459,7 +464,7 @@ class BandcampUserIE(InfoExtractor):
},
}, {
'url': 'https://coldworldofficial.bandcamp.com/music',
'playlist_mincount': 10,
'playlist_mincount': 7,
'info_dict': {
'id': 'coldworldofficial',
'title': 'Discography of coldworldofficial',
@@ -473,12 +478,19 @@ class BandcampUserIE(InfoExtractor):
},
}]
def _yield_items(self, webpage):
yield from (
re.findall(r'<li data-item-id=["\'][^>]+>\s*<a href=["\'](?![^"\'/]*?/merch)([^"\']+)', webpage)
or re.findall(r'<div[^>]+trackTitle["\'][^"\']+["\']([^"\']+)', webpage))
yield from traverse_obj(webpage, (
{functools.partial(get_element_html_by_id, 'music-grid')}, {extract_attributes},
'data-client-items', {json.loads}, ..., 'page_url', {str}))
def _real_extract(self, url):
uploader = self._match_id(url)
webpage = self._download_webpage(url, uploader)
discography_data = (re.findall(r'<li data-item-id=["\'][^>]+>\s*<a href=["\'](?![^"\'/]*?/merch)([^"\']+)', webpage)
or re.findall(r'<div[^>]+trackTitle["\'][^"\']+["\']([^"\']+)', webpage))
return self.playlist_from_matches(
discography_data, uploader, f'Discography of {uploader}', getter=lambda x: urljoin(url, x))
self._yield_items(webpage), uploader, f'Discography of {uploader}',
getter=functools.partial(urljoin, url))

View File

@@ -0,0 +1,68 @@
import json
from .common import InfoExtractor
from ..utils import (
ExtractorError,
parse_iso8601,
traverse_obj,
)
class BeaconTvIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?beacon\.tv/content/(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://beacon.tv/content/welcome-to-beacon',
'md5': 'b3f5932d437f288e662f10f3bfc5bd04',
'info_dict': {
'id': 'welcome-to-beacon',
'ext': 'mp4',
'upload_date': '20240509',
'description': 'md5:ea2bd32e71acf3f9fca6937412cc3563',
'thumbnail': 'https://cdn.jwplayer.com/v2/media/I4CkkEvN/poster.jpg?width=720',
'title': 'Your home for Critical Role!',
'timestamp': 1715227200,
'duration': 105.494,
},
}, {
'url': 'https://beacon.tv/content/re-slayers-take-trailer',
'md5': 'd879b091485dbed2245094c8152afd89',
'info_dict': {
'id': 're-slayers-take-trailer',
'ext': 'mp4',
'title': 'The Re-Slayers Take | Official Trailer',
'timestamp': 1715189040,
'upload_date': '20240508',
'duration': 53.249,
'thumbnail': 'https://cdn.jwplayer.com/v2/media/PW5ApIw3/poster.jpg?width=720',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
content_data = traverse_obj(self._search_nextjs_data(webpage, video_id), (
'props', 'pageProps', '__APOLLO_STATE__',
lambda k, v: k.startswith('Content:') and v['slug'] == video_id, any))
if not content_data:
raise ExtractorError('Failed to extract content data')
jwplayer_data = traverse_obj(content_data, (
(('contentVideo', 'video', 'videoData'),
('contentPodcast', 'podcast', 'audioData')), {json.loads}, {dict}, any))
if not jwplayer_data:
if content_data.get('contentType') not in ('videoPodcast', 'video', 'podcast'):
raise ExtractorError('Content is not a video/podcast', expected=True)
if traverse_obj(content_data, ('contentTier', '__ref')) != 'MemberTier:65b258d178f89be87b4dc0a4':
self.raise_login_required('This video/podcast is for members only')
raise ExtractorError('Failed to extract content')
return {
**self._parse_jwplayer_data(jwplayer_data, video_id),
**traverse_obj(content_data, {
'title': ('title', {str}),
'description': ('description', {str}),
'timestamp': ('publishedAt', {parse_iso8601}),
}),
}

View File

@@ -46,6 +46,7 @@ from ..utils import (
class BilibiliBaseIE(InfoExtractor):
_HEADERS = {'Referer': 'https://www.bilibili.com/'}
_FORMAT_ID_RE = re.compile(r'-(\d+)\.m4s\?')
_WBI_KEY_CACHE_TIMEOUT = 30 # exact expire timeout is unclear, use 30s for one session
_wbi_key_cache = {}
@@ -192,7 +193,7 @@ class BilibiliBaseIE(InfoExtractor):
video_info = self._download_json(
'https://api.bilibili.com/x/player/v2', video_id,
query={'aid': aid, 'cid': cid} if aid else {'bvid': video_id, 'cid': cid},
note=f'Extracting subtitle info {cid}')
note=f'Extracting subtitle info {cid}', headers=self._HEADERS)
if traverse_obj(video_info, ('data', 'need_login_subtitle')):
self.report_warning(
f'Subtitles are only available when logged in. {self._login_hint()}', only_once=True)
@@ -207,7 +208,7 @@ class BilibiliBaseIE(InfoExtractor):
def _get_chapters(self, aid, cid):
chapters = aid and cid and self._download_json(
'https://api.bilibili.com/x/player/v2', aid, query={'aid': aid, 'cid': cid},
note='Extracting chapters', fatal=False)
note='Extracting chapters', fatal=False, headers=self._HEADERS)
return traverse_obj(chapters, ('data', 'view_points', ..., {
'title': 'content',
'start_time': 'from',
@@ -298,7 +299,7 @@ class BilibiliBaseIE(InfoExtractor):
class BiliBiliIE(BilibiliBaseIE):
_VALID_URL = r'https?://(?:www\.)?bilibili\.com/(?:video/|festival/\w+\?(?:[^#]*&)?bvid=)[aAbB][vV](?P<id>[^/?#&]+)'
_VALID_URL = r'https?://(?:www\.)?bilibili\.com/(?:video/|festival/[^/?#]+\?(?:[^#]*&)?bvid=)[aAbB][vV](?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://www.bilibili.com/video/BV13x41117TL',
@@ -622,6 +623,10 @@ class BiliBiliIE(BilibiliBaseIE):
'ext': 'mp4',
},
'skip': 'geo-restricted',
}, {
'note': 'has - in the last path segment of the url',
'url': 'https://www.bilibili.com/festival/bh3-7th?bvid=BV1tr4y1f7p2&',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -1017,8 +1022,6 @@ class BiliBiliBangumiSeasonIE(BilibiliBaseIE):
class BilibiliCheeseBaseIE(BilibiliBaseIE):
_HEADERS = {'Referer': 'https://www.bilibili.com/'}
def _extract_episode(self, season_info, ep_id):
episode_info = traverse_obj(season_info, (
'episodes', lambda _, v: v['id'] == int(ep_id)), get_all=False)
@@ -1848,7 +1851,7 @@ class BiliBiliPlayerIE(InfoExtractor):
class BiliIntlBaseIE(InfoExtractor):
_API_URL = 'https://api.bilibili.tv/intl/gateway'
_NETRC_MACHINE = 'biliintl'
_HEADERS = {'Referer': 'https://www.bilibili.com/'}
_HEADERS = {'Referer': 'https://www.bilibili.tv/'}
def _call_api(self, endpoint, *args, **kwargs):
json = self._download_json(self._API_URL + endpoint, *args, **kwargs)

View File

@@ -12,7 +12,7 @@ from ..utils.traversal import traverse_obj
class BoxIE(InfoExtractor):
_VALID_URL = r'https?://(?:[^.]+\.)?app\.box\.com/s/(?P<shared_name>[^/?#]+)(?:/file/(?P<id>\d+))?'
_VALID_URL = r'https?://(?:[^.]+\.)?(?P<service>app|ent)\.box\.com/s/(?P<shared_name>[^/?#]+)(?:/file/(?P<id>\d+))?'
_TESTS = [{
'url': 'https://mlssoccer.app.box.com/s/0evd2o3e08l60lr4ygukepvnkord1o1x/file/510727257538',
'md5': '1f81b2fd3960f38a40a3b8823e5fcd43',
@@ -38,10 +38,22 @@ class BoxIE(InfoExtractor):
'uploader_id': '239068974',
},
'params': {'skip_download': 'dash fragment too small'},
}, {
'url': 'https://thejacksonlaboratory.ent.box.com/s/2x09dm6vcg6y28o0oox1so4l0t8wzt6l/file/1536173056065',
'info_dict': {
'id': '1536173056065',
'ext': 'mp4',
'uploader_id': '18523128264',
'uploader': 'Lexi Hennigan',
'title': 'iPSC Symposium recording part 1.mp4',
'timestamp': 1716228343,
'upload_date': '20240520',
},
'params': {'skip_download': 'dash fragment too small'},
}]
def _real_extract(self, url):
shared_name, file_id = self._match_valid_url(url).groups()
shared_name, file_id, service = self._match_valid_url(url).group('shared_name', 'id', 'service')
webpage = self._download_webpage(url, file_id or shared_name)
if not file_id:
@@ -57,14 +69,14 @@ class BoxIE(InfoExtractor):
request_token = self._search_json(
r'Box\.config\s*=', webpage, 'Box config', file_id)['requestToken']
access_token = self._download_json(
'https://app.box.com/app-api/enduserapp/elements/tokens', file_id,
f'https://{service}.box.com/app-api/enduserapp/elements/tokens', file_id,
'Downloading token JSON metadata',
data=json.dumps({'fileIDs': [file_id]}).encode(), headers={
'Content-Type': 'application/json',
'X-Request-Token': request_token,
'X-Box-EndUser-API': 'sharedName=' + shared_name,
})[file_id]['read']
shared_link = 'https://app.box.com/s/' + shared_name
shared_link = f'https://{service}.box.com/s/{shared_name}'
f = self._download_json(
'https://api.box.com/2.0/files/' + file_id, file_id,
'Downloading file JSON metadata', headers={

View File

@@ -3,7 +3,7 @@ from ..utils import float_or_none, int_or_none, make_archive_id, traverse_obj
class CallinIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?callin\.com/(episode)/(?P<id>[-a-zA-Z]+)'
_VALID_URL = r'https?://(?:www\.)?callin\.com/episode/(?P<id>[-a-zA-Z]+)'
_TESTS = [{
'url': 'https://www.callin.com/episode/the-title-ix-regime-and-the-long-march-through-EBfXYSrsjc',
'info_dict': {

View File

@@ -1,4 +1,5 @@
import base64
import functools
import json
import re
import time
@@ -6,17 +7,24 @@ import urllib.parse
import xml.etree.ElementTree
from .common import InfoExtractor
from ..networking import HEADRequest
from ..utils import (
ExtractorError,
float_or_none,
int_or_none,
join_nonempty,
js_to_json,
mimetype2ext,
orderedSet,
parse_iso8601,
replace_extension,
smuggle_url,
strip_or_none,
traverse_obj,
try_get,
update_url,
url_basename,
url_or_none,
)
@@ -149,6 +157,7 @@ class CBCIE(InfoExtractor):
class CBCPlayerIE(InfoExtractor):
IE_NAME = 'cbc.ca:player'
_VALID_URL = r'(?:cbcplayer:|https?://(?:www\.)?cbc\.ca/(?:player/play/(?:video/)?|i/caffeine/syndicate/\?mediaId=))(?P<id>(?:\d\.)?\d+)'
_GEO_COUNTRIES = ['CA']
_TESTS = [{
'url': 'http://www.cbc.ca/player/play/2683190193',
'md5': '64d25f841ddf4ddb28a235338af32e2c',
@@ -172,21 +181,20 @@ class CBCPlayerIE(InfoExtractor):
'description': 'md5:dd3b692f0a139b0369943150bd1c46a9',
'timestamp': 1425704400,
'upload_date': '20150307',
'uploader': 'CBCC-NEW',
'thumbnail': 'http://thumbnails.cbc.ca/maven_legacy/thumbnails/sonali-karnick-220.jpg',
'thumbnail': 'https://i.cbc.ca/ais/1.2985700,1717262248558/full/max/0/default.jpg',
'chapters': [],
'duration': 494.811,
'categories': ['AudioMobile/All in a Weekend Montreal'],
'tags': 'count:8',
'categories': ['All in a Weekend Montreal'],
'tags': 'count:11',
'location': 'Quebec',
'series': 'All in a Weekend Montreal',
'season': 'Season 2015',
'season_number': 2015,
'media_type': 'Excerpt',
'genres': ['Other'],
},
}, {
'url': 'http://www.cbc.ca/i/caffeine/syndicate/?mediaId=2164402062',
'md5': '33fcd8f6719b9dd60a5e73adcb83b9f6',
'info_dict': {
'id': '2164402062',
'ext': 'mp4',
@@ -194,107 +202,168 @@ class CBCPlayerIE(InfoExtractor):
'description': 'Tim Mayer has beaten three different forms of cancer four times in five years.',
'timestamp': 1320410746,
'upload_date': '20111104',
'uploader': 'CBCC-NEW',
'thumbnail': 'https://thumbnails.cbc.ca/maven_legacy/thumbnails/277/67/cancer_852x480_2164412612.jpg',
'thumbnail': 'https://i.cbc.ca/ais/1.1711287,1717139372111/full/max/0/default.jpg',
'chapters': [],
'duration': 186.867,
'series': 'CBC News: Windsor at 6:00',
'categories': ['News/Canada/Windsor'],
'categories': ['Windsor'],
'location': 'Windsor',
'tags': ['cancer'],
'creators': ['Allison Johnson'],
'tags': ['Cancer', 'News/Canada/Windsor', 'Windsor'],
'media_type': 'Excerpt',
'genres': ['News'],
},
'params': {'skip_download': 'm3u8'},
}, {
# Redirected from http://www.cbc.ca/player/AudioMobile/All%20in%20a%20Weekend%20Montreal/ID/2657632011/
'url': 'https://www.cbc.ca/player/play/1.2985700',
'md5': 'e5e708c34ae6fca156aafe17c43e8b75',
'info_dict': {
'id': '2657631896',
'id': '1.2985700',
'ext': 'mp3',
'title': 'CBC Montreal is organizing its first ever community hackathon!',
'description': 'The modern technology we tend to depend on so heavily, is never without it\'s share of hiccups and headaches. Next weekend - CBC Montreal will be getting members of the public for its first Hackathon.',
'timestamp': 1425704400,
'upload_date': '20150307',
'uploader': 'CBCC-NEW',
'thumbnail': 'http://thumbnails.cbc.ca/maven_legacy/thumbnails/sonali-karnick-220.jpg',
'thumbnail': 'https://i.cbc.ca/ais/1.2985700,1717262248558/full/max/0/default.jpg',
'chapters': [],
'duration': 494.811,
'categories': ['AudioMobile/All in a Weekend Montreal'],
'tags': 'count:8',
'categories': ['All in a Weekend Montreal'],
'tags': 'count:11',
'location': 'Quebec',
'series': 'All in a Weekend Montreal',
'season': 'Season 2015',
'season_number': 2015,
'media_type': 'Excerpt',
'genres': ['Other'],
},
}, {
'url': 'https://www.cbc.ca/player/play/1.1711287',
'md5': '33fcd8f6719b9dd60a5e73adcb83b9f6',
'info_dict': {
'id': '2164402062',
'id': '1.1711287',
'ext': 'mp4',
'title': 'Cancer survivor four times over',
'description': 'Tim Mayer has beaten three different forms of cancer four times in five years.',
'timestamp': 1320410746,
'upload_date': '20111104',
'uploader': 'CBCC-NEW',
'thumbnail': 'https://thumbnails.cbc.ca/maven_legacy/thumbnails/277/67/cancer_852x480_2164412612.jpg',
'thumbnail': 'https://i.cbc.ca/ais/1.1711287,1717139372111/full/max/0/default.jpg',
'chapters': [],
'duration': 186.867,
'series': 'CBC News: Windsor at 6:00',
'categories': ['News/Canada/Windsor'],
'categories': ['Windsor'],
'location': 'Windsor',
'tags': ['cancer'],
'creators': ['Allison Johnson'],
'tags': ['Cancer', 'News/Canada/Windsor', 'Windsor'],
'media_type': 'Excerpt',
'genres': ['News'],
},
'params': {'skip_download': 'm3u8'},
}, {
# Has subtitles
# These broadcasts expire after ~1 month, can find new test URL here:
# https://www.cbc.ca/player/news/TV%20Shows/The%20National/Latest%20Broadcast
'url': 'https://www.cbc.ca/player/play/1.7159484',
'md5': '6ed6cd0fc2ef568d2297ba68a763d455',
'url': 'https://www.cbc.ca/player/play/video/9.6424403',
'md5': '8025909eaffcf0adf59922904def9a5e',
'info_dict': {
'id': '2324213316001',
'id': '9.6424403',
'ext': 'mp4',
'title': 'The National | School boards sue social media giants',
'description': 'md5:4b4db69322fa32186c3ce426da07402c',
'timestamp': 1711681200,
'duration': 2743.400,
'subtitles': {'eng': [{'ext': 'vtt', 'protocol': 'm3u8_native'}]},
'thumbnail': 'https://thumbnails.cbc.ca/maven_legacy/thumbnails/607/559/thumbnail.jpeg',
'uploader': 'CBCC-NEW',
'title': 'The National | N.W.T. wildfire emergency',
'description': 'md5:ada33d36d1df69347ed575905bfd496c',
'timestamp': 1718589600,
'duration': 2692.833,
'subtitles': {
'en-US': [{
'name': 'English Captions',
'url': 'https://cbchls.akamaized.net/delivery/news-shows/2024/06/17/NAT_JUN16-00-55-00/NAT_JUN16_cc.vtt',
}],
},
'thumbnail': 'https://i.cbc.ca/ais/6272b5c6-5e78-4c05-915d-0e36672e33d1,1714756287822/full/max/0/default.jpg',
'chapters': 'count:5',
'upload_date': '20240329',
'categories': 'count:4',
'upload_date': '20240617',
'categories': ['News', 'The National', 'The National Latest Broadcasts'],
'series': 'The National - Full Show',
'tags': 'count:1',
'creators': ['News'],
'tags': ['The National'],
'location': 'Canada',
'media_type': 'Full Program',
'genres': ['News'],
},
}, {
'url': 'https://www.cbc.ca/player/play/video/1.7194274',
'md5': '188b96cf6bdcb2540e178a6caa957128',
'info_dict': {
'id': '2334524995812',
'id': '1.7194274',
'ext': 'mp4',
'title': '#TheMoment a rare white spirit moose was spotted in Alberta',
'description': 'md5:18ae269a2d0265c5b0bbe4b2e1ac61a3',
'timestamp': 1714788791,
'duration': 77.678,
'subtitles': {'eng': [{'ext': 'vtt', 'protocol': 'm3u8_native'}]},
'thumbnail': 'https://thumbnails.cbc.ca/maven_legacy/thumbnails/201/543/THE_MOMENT.jpg',
'uploader': 'CBCC-NEW',
'chapters': 'count:0',
'upload_date': '20240504',
'thumbnail': 'https://i.cbc.ca/ais/1.7194274,1717224990425/full/max/0/default.jpg',
'chapters': [],
'categories': 'count:3',
'series': 'The National',
'tags': 'count:15',
'creators': ['encoder'],
'tags': 'count:17',
'location': 'Canada',
'media_type': 'Excerpt',
'upload_date': '20240504',
'genres': ['News'],
},
}, {
'url': 'https://www.cbc.ca/player/play/video/9.6427282',
'info_dict': {
'id': '9.6427282',
'ext': 'mp4',
'title': 'Men\'s Soccer - Argentina vs Morocco',
'description': 'Argentina faces Morocco on the football pitch at Saint Etienne Stadium.',
'series': 'CBC Sports',
'media_type': 'Event Coverage',
'thumbnail': 'https://i.cbc.ca/ais/a4c5c0c2-99fa-4bd3-8061-5a63879c1b33,1718828053500/full/max/0/default.jpg',
'timestamp': 1721825400.0,
'upload_date': '20240724',
'duration': 10568.0,
'chapters': [],
'genres': [],
'tags': ['2024 Paris Olympic Games'],
'categories': ['Olympics Summer Soccer', 'Summer Olympics Replays', 'Summer Olympics Soccer Replays'],
'location': 'Canada',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.cbc.ca/player/play/video/9.6459530',
'md5': '6c1bb76693ab321a2e99c347a1d5ecbc',
'info_dict': {
'id': '9.6459530',
'ext': 'mp4',
'title': 'Parts of Jasper incinerated as wildfire rages',
'description': 'md5:6f1caa8d128ad3f629257ef5fecf0962',
'series': 'The National',
'media_type': 'Excerpt',
'thumbnail': 'https://i.cbc.ca/ais/507c0086-31a2-494d-96e4-bffb1048d045,1721953984375/full/max/0/default.jpg',
'timestamp': 1721964091.012,
'upload_date': '20240726',
'duration': 952.285,
'chapters': [],
'genres': [],
'tags': 'count:23',
'categories': ['News (FAST)', 'News', 'The National', 'TV News Shows', 'The National '],
},
}, {
'url': 'https://www.cbc.ca/player/play/video/9.6420651',
'md5': '71a850c2c6ee5e912de169f5311bb533',
'info_dict': {
'id': '9.6420651',
'ext': 'mp4',
'title': 'Is it a breath of fresh air? Measuring air quality in Edmonton',
'description': 'md5:3922b92cc8b69212d739bd9dd095b1c3',
'series': 'CBC News Edmonton',
'media_type': 'Excerpt',
'thumbnail': 'https://i.cbc.ca/ais/73c4ab9c-7ad4-46ee-bb9b-020fdc01c745,1718214547576/full/max/0/default.jpg',
'timestamp': 1718220065.768,
'upload_date': '20240612',
'duration': 286.086,
'chapters': [],
'genres': ['News'],
'categories': ['News', 'Edmonton'],
'tags': 'count:7',
'location': 'Edmonton',
},
}, {
'url': 'cbcplayer:1.7159484',
@@ -307,23 +376,113 @@ class CBCPlayerIE(InfoExtractor):
'only_matching': True,
}]
def _parse_param(self, asset_data, name):
return traverse_obj(asset_data, ('params', lambda _, v: v['name'] == name, 'value', {str}, any))
def _real_extract(self, url):
video_id = self._match_id(url)
if '.' in video_id:
webpage = self._download_webpage(f'https://www.cbc.ca/player/play/{video_id}', video_id)
video_id = self._search_json(
r'window\.__INITIAL_STATE__\s*=', webpage,
'initial state', video_id)['video']['currentClip']['mediaId']
webpage = self._download_webpage(f'https://www.cbc.ca/player/play/{video_id}', video_id)
data = self._search_json(
r'window\.__INITIAL_STATE__\s*=', webpage, 'initial state', video_id)['video']['currentClip']
assets = traverse_obj(
data, ('media', 'assets', lambda _, v: url_or_none(v['key']) and v['type']))
if not assets and (media_id := traverse_obj(data, ('mediaId', {str}))):
# XXX: Deprecated; CBC is migrating off of ThePlatform
return {
'_type': 'url_transparent',
'ie_key': 'ThePlatform',
'url': smuggle_url(
f'http://link.theplatform.com/s/ExhSPC/media/guid/2655402169/{media_id}?mbr=true&formats=MPEG4,FLV,MP3', {
'force_smil_url': True,
}),
'id': media_id,
'_format_sort_fields': ('res', 'proto'), # Prioritize direct http formats over HLS
}
is_live = traverse_obj(data, ('media', 'streamType', {str})) == 'Live'
formats, subtitles = [], {}
for sub in traverse_obj(data, ('media', 'textTracks', lambda _, v: url_or_none(v['src']))):
subtitles.setdefault(sub.get('language') or 'und', []).append({
'url': sub['src'],
'name': sub.get('label'),
})
for asset in assets:
asset_key = asset['key']
asset_type = asset['type']
if asset_type != 'medianet':
self.report_warning(f'Skipping unsupported asset type "{asset_type}": {asset_key}')
continue
asset_data = self._download_json(asset_key, video_id, f'Downloading {asset_type} JSON')
ext = mimetype2ext(self._parse_param(asset_data, 'contentType'))
if ext == 'm3u8':
fmts, subs = self._extract_m3u8_formats_and_subtitles(
asset_data['url'], video_id, 'mp4', m3u8_id='hls', live=is_live)
formats.extend(fmts)
# Avoid slow/error-prone webvtt-over-m3u8 if direct https vtt is available
if not subtitles:
self._merge_subtitles(subs, target=subtitles)
if is_live or not fmts:
continue
# Check for direct https mp4 format
best_video_fmt = traverse_obj(fmts, (
lambda _, v: v.get('vcodec') != 'none' and v['tbr'], all,
{functools.partial(sorted, key=lambda x: x['tbr'])}, -1, {dict})) or {}
base_url = self._search_regex(
r'(https?://[^?#]+?/)hdntl=', best_video_fmt.get('url'), 'base url', default=None)
if not base_url or '/live/' in base_url:
continue
mp4_url = base_url + replace_extension(url_basename(best_video_fmt['url']), 'mp4')
if self._request_webpage(
HEADRequest(mp4_url), video_id, 'Checking for https format',
errnote=False, fatal=False):
formats.append({
**best_video_fmt,
'url': mp4_url,
'format_id': 'https-mp4',
'protocol': 'https',
'manifest_url': None,
'acodec': None,
})
else:
formats.append({
'url': asset_data['url'],
'ext': ext,
'vcodec': 'none' if self._parse_param(asset_data, 'mediaType') == 'audio' else None,
})
chapters = traverse_obj(data, (
'media', 'chapters', lambda _, v: float(v['startTime']) is not None, {
'start_time': ('startTime', {functools.partial(float_or_none, scale=1000)}),
'end_time': ('endTime', {functools.partial(float_or_none, scale=1000)}),
'title': ('name', {str}),
}))
# Filter out pointless single chapters with start_time==0 and no end_time
if len(chapters) == 1 and not (chapters[0].get('start_time') or chapters[0].get('end_time')):
chapters = []
return {
'_type': 'url_transparent',
'ie_key': 'ThePlatform',
'url': smuggle_url(
f'http://link.theplatform.com/s/ExhSPC/media/guid/2655402169/{video_id}?mbr=true&formats=MPEG4,FLV,MP3', {
'force_smil_url': True,
}),
**traverse_obj(data, {
'title': ('title', {str}),
'description': ('description', {str.strip}),
'thumbnail': ('image', 'url', {url_or_none}, {functools.partial(update_url, query=None)}),
'timestamp': ('publishedAt', {functools.partial(float_or_none, scale=1000)}),
'media_type': ('media', 'clipType', {str}),
'series': ('showName', {str}),
'season_number': ('media', 'season', {int_or_none}),
'duration': ('media', 'duration', {float_or_none}, {lambda x: None if is_live else x}),
'location': ('media', 'region', {str}),
'tags': ('tags', ..., 'name', {str}),
'genres': ('media', 'genre', all),
'categories': ('categories', ..., 'name', {str}),
}),
'id': video_id,
'_format_sort_fields': ('res', 'proto'), # Prioritize direct http formats over HLS
'formats': formats,
'subtitles': subtitles,
'chapters': chapters,
'is_live': is_live,
}
@@ -647,11 +806,11 @@ class CBCGemLiveIE(InfoExtractor):
'title': 'Ottawa',
'description': 'The live TV channel and local programming from Ottawa',
'thumbnail': 'https://thumbnails.cbc.ca/maven_legacy/thumbnails/CBC_OTT_VMS/Live_Channel_Static_Images/Ottawa_2880x1620.jpg',
'is_live': True,
'live_status': 'is_live',
'id': 'AyqZwxRqh8EH',
'ext': 'mp4',
'timestamp': 1492106160,
'upload_date': '20170413',
'release_timestamp': 1492106160,
'release_date': '20170413',
'uploader': 'CBCC-NEW',
},
'skip': 'Live might have ended',
@@ -680,49 +839,84 @@ class CBCGemLiveIE(InfoExtractor):
'description': 'March 24, 2023 | President Bidens Ottawa visit ends with big pledges from both countries. Plus, Gwyneth Paltrow testifies in her ski collision trial.',
'live_status': 'is_live',
'thumbnail': r're:https://images.gem.cbc.ca/v1/cbc-gem/live/.*',
'timestamp': 1679706000,
'upload_date': '20230325',
'release_timestamp': 1679706000,
'release_date': '20230325',
},
'params': {'skip_download': True},
'skip': 'Live might have ended',
},
{ # event replay (medianetlive)
'url': 'https://gem.cbc.ca/live-event/42314',
'md5': '297a9600f554f2258aed01514226a697',
'info_dict': {
'id': '42314',
'ext': 'mp4',
'live_status': 'was_live',
'title': 'Women\'s Soccer - Canada vs New Zealand',
'description': 'md5:36200e5f1a70982277b5a6ecea86155d',
'thumbnail': r're:https://.+default\.jpg',
'release_timestamp': 1721917200,
'release_date': '20240725',
},
'params': {'skip_download': True},
'skip': 'Replay might no longer be available',
},
{ # event replay (medianetlive)
'url': 'https://gem.cbc.ca/live-event/43273',
'only_matching': True,
},
]
_GEO_COUNTRIES = ['CA']
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_info = self._search_nextjs_data(webpage, video_id)['props']['pageProps']['data']
# Two types of metadata JSON
# Three types of video_info JSON: info in root, freeTv stream/item, event replay
if not video_info.get('formattedIdMedia'):
video_info = traverse_obj(
video_info, (('freeTv', ('streams', ...)), 'items', lambda _, v: v['key'] == video_id, {dict}),
get_all=False, default={})
if traverse_obj(video_info, ('event', 'key')) == video_id:
video_info = video_info['event']
else:
video_info = traverse_obj(video_info, (
('freeTv', ('streams', ...)), 'items',
lambda _, v: v['key'].partition('-')[0] == video_id, any)) or {}
video_stream_id = video_info.get('formattedIdMedia')
if not video_stream_id:
raise ExtractorError('Couldn\'t find video metadata, maybe this livestream is now offline', expected=True)
raise ExtractorError(
'Couldn\'t find video metadata, maybe this livestream is now offline', expected=True)
stream_data = self._download_json(
'https://services.radio-canada.ca/media/validation/v2/', video_id, query={
'appCode': 'mpx',
'connectionType': 'hd',
'deviceType': 'ipad',
'idMedia': video_stream_id,
'multibitrate': 'true',
'output': 'json',
'tech': 'hls',
'manifestType': 'desktop',
})
live_status = 'was_live' if video_info.get('isVodEnabled') else 'is_live'
release_timestamp = traverse_obj(video_info, ('airDate', {parse_iso8601}))
if live_status == 'is_live' and release_timestamp and release_timestamp > time.time():
formats = []
live_status = 'is_upcoming'
self.raise_no_formats('This livestream has not yet started', expected=True)
else:
stream_data = self._download_json(
'https://services.radio-canada.ca/media/validation/v2/', video_id, query={
'appCode': 'medianetlive',
'connectionType': 'hd',
'deviceType': 'ipad',
'idMedia': video_stream_id,
'multibitrate': 'true',
'output': 'json',
'tech': 'hls',
'manifestType': 'desktop',
})
formats = self._extract_m3u8_formats(
stream_data['url'], video_id, 'mp4', live=live_status == 'is_live')
return {
'id': video_id,
'formats': self._extract_m3u8_formats(stream_data['url'], video_id, 'mp4', live=True),
'is_live': True,
'formats': formats,
'live_status': live_status,
'release_timestamp': release_timestamp,
**traverse_obj(video_info, {
'title': 'title',
'description': 'description',
'title': ('title', {str}),
'description': ('description', {str}),
'thumbnail': ('images', 'card', 'url'),
'timestamp': ('airDate', {parse_iso8601}),
}),
}

View File

@@ -1,63 +1,50 @@
from .common import InfoExtractor
from ..utils import traverse_obj
from .vidyard import VidyardBaseIE, VidyardIE
from ..utils import ExtractorError, make_archive_id, url_basename
class CellebriteIE(InfoExtractor):
class CellebriteIE(VidyardBaseIE):
_VALID_URL = r'https?://cellebrite\.com/(?:\w+)?/(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://cellebrite.com/en/collect-data-from-android-devices-with-cellebrite-ufed/',
'info_dict': {
'id': '16025876',
'id': 'ZqmUss3dQfEMGpauambPuH',
'display_id': '16025876',
'ext': 'mp4',
'description': 'md5:174571cb97083fd1d457d75c684f4e2b',
'thumbnail': 'https://cellebrite.com/wp-content/uploads/2021/05/Chat-Capture-1024x559.png',
'title': 'Ask the Expert: Chat Capture - Collect Data from Android Devices in Cellebrite UFED',
'duration': 455,
'tags': [],
'description': 'md5:dee48fe12bbae5c01fe6a053f7676da4',
'thumbnail': 'https://cellebrite.com/wp-content/uploads/2021/05/Chat-Capture-1024x559.png',
'duration': 455.979,
'_old_archive_ids': ['cellebrite 16025876'],
},
}, {
'url': 'https://cellebrite.com/en/how-to-lawfully-collect-the-maximum-amount-of-data-from-android-devices/',
'info_dict': {
'id': '29018255',
'id': 'QV1U8a2yzcxigw7VFnqKyg',
'display_id': '29018255',
'ext': 'mp4',
'duration': 134,
'tags': [],
'description': 'md5:e9a3d124c7287b0b07bad2547061cacf',
'title': 'How to Lawfully Collect the Maximum Amount of Data From Android Devices',
'description': 'md5:0e943a9ac14c374d5d74faed634d773c',
'thumbnail': 'https://cellebrite.com/wp-content/uploads/2022/07/How-to-Lawfully-Collect-the-Maximum-Amount-of-Data-From-Android-Devices.png',
'title': 'Android Extractions Explained',
'duration': 134.315,
'_old_archive_ids': ['cellebrite 29018255'],
},
}]
def _get_formats_and_subtitles(self, json_data, display_id):
formats = [{'url': url} for url in traverse_obj(json_data, ('mp4', ..., 'url')) or []]
subtitles = {}
for url in traverse_obj(json_data, ('hls', ..., 'url')) or []:
fmt, sub = self._extract_m3u8_formats_and_subtitles(
url, display_id, ext='mp4', headers={'Referer': 'https://play.vidyard.com/'})
formats.extend(fmt)
self._merge_subtitles(sub, target=subtitles)
return formats, subtitles
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
slug = self._match_id(url)
webpage = self._download_webpage(url, slug)
vidyard_url = next(VidyardIE._extract_embed_urls(url, webpage), None)
if not vidyard_url:
raise ExtractorError('No Vidyard video embeds found on page')
player_uuid = self._search_regex(
r'<img\s[^>]*\bdata-uuid\s*=\s*"([^"\?]+)', webpage, 'player UUID')
json_data = self._download_json(
f'https://play.vidyard.com/player/{player_uuid}.json', display_id)['payload']['chapters'][0]
video_id = url_basename(vidyard_url)
info = self._process_video_json(self._fetch_video_json(video_id)['chapters'][0], video_id)
if info.get('display_id'):
info['_old_archive_ids'] = [make_archive_id(self, info['display_id'])]
if thumbnail := self._og_search_thumbnail(webpage, default=None):
info.setdefault('thumbnails', []).append({'url': thumbnail})
formats, subtitles = self._get_formats_and_subtitles(json_data['sources'], display_id)
return {
'id': str(json_data['videoId']),
'title': json_data.get('name') or self._og_search_title(webpage),
'formats': formats,
'subtitles': subtitles,
'description': json_data.get('description') or self._og_search_description(webpage),
'duration': json_data.get('seconds'),
'tags': json_data.get('tags'),
'thumbnail': self._og_search_thumbnail(webpage),
'http_headers': {'Referer': 'https://play.vidyard.com/'},
'description': self._og_search_description(webpage, default=None),
**info,
}

View File

@@ -36,7 +36,7 @@ class CHZZKLiveIE(InfoExtractor):
def _real_extract(self, url):
channel_id = self._match_id(url)
live_detail = self._download_json(
f'https://api.chzzk.naver.com/service/v2/channels/{channel_id}/live-detail', channel_id,
f'https://api.chzzk.naver.com/service/v3/channels/{channel_id}/live-detail', channel_id,
note='Downloading channel info', errnote='Unable to download channel info')['content']
if live_detail.get('status') == 'CLOSE':
@@ -106,12 +106,45 @@ class CHZZKVideoIE(InfoExtractor):
'upload_date': '20231219',
'view_count': int,
},
'skip': 'Replay video is expired',
}, {
# Manually uploaded video
'url': 'https://chzzk.naver.com/video/1980',
'info_dict': {
'id': '1980',
'ext': 'mp4',
'title': '※시청주의※한번보면 잊기 힘든 영상',
'channel': '라디유radiyu',
'channel_id': '68f895c59a1043bc5019b5e08c83a5c5',
'channel_is_verified': False,
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 95,
'timestamp': 1703102631.722,
'upload_date': '20231220',
'view_count': int,
},
}, {
# Partner channel replay video
'url': 'https://chzzk.naver.com/video/2458',
'info_dict': {
'id': '2458',
'ext': 'mp4',
'title': '첫 방송',
'channel': '강지',
'channel_id': 'b5ed5db484d04faf4d150aedd362f34b',
'channel_is_verified': True,
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 4433,
'timestamp': 1703307460.214,
'upload_date': '20231223',
'view_count': int,
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
video_meta = self._download_json(
f'https://api.chzzk.naver.com/service/v2/videos/{video_id}', video_id,
f'https://api.chzzk.naver.com/service/v3/videos/{video_id}', video_id,
note='Downloading video info', errnote='Unable to download video info')['content']
formats, subtitles = self._extract_mpd_formats_and_subtitles(
f'https://apis.naver.com/neonplayer/vodplay/v1/playback/{video_meta["videoId"]}', video_id,

View File

@@ -35,6 +35,7 @@ from ..networking import HEADRequest, Request
from ..networking.exceptions import (
HTTPError,
IncompleteRead,
TransportError,
network_exceptions,
)
from ..networking.impersonate import ImpersonateTarget
@@ -965,6 +966,9 @@ class InfoExtractor:
return False
content = self._webpage_read_content(urlh, url_or_request, video_id, note, errnote, fatal,
encoding=encoding, data=data)
if content is False:
assert not fatal
return False
return (content, urlh)
@staticmethod
@@ -1039,7 +1043,15 @@ class InfoExtractor:
def _webpage_read_content(self, urlh, url_or_request, video_id, note=None, errnote=None, fatal=True,
prefix=None, encoding=None, data=None):
webpage_bytes = urlh.read()
try:
webpage_bytes = urlh.read()
except TransportError as err:
errmsg = f'{video_id}: Error reading response: {err.msg}'
if fatal:
raise ExtractorError(errmsg, cause=err)
self.report_warning(errmsg)
return False
if prefix is not None:
webpage_bytes = prefix + webpage_bytes
if self.get_param('dump_intermediate_pages', False):
@@ -2065,7 +2077,7 @@ class InfoExtractor:
has_drm = HlsFD._has_drm(m3u8_doc)
def format_url(url):
return url if re.match(r'^https?://', url) else urllib.parse.urljoin(m3u8_url, url)
return url if re.match(r'https?://', url) else urllib.parse.urljoin(m3u8_url, url)
if self.get_param('hls_split_discontinuity', False):
def _extract_m3u8_playlist_indices(manifest_url=None, m3u8_doc=None):
@@ -2800,11 +2812,11 @@ class InfoExtractor:
base_url_e = element.find(_add_ns('BaseURL'))
if try_call(lambda: base_url_e.text) is not None:
base_url = base_url_e.text + base_url
if re.match(r'^https?://', base_url):
if re.match(r'https?://', base_url):
break
if mpd_base_url and base_url.startswith('/'):
base_url = urllib.parse.urljoin(mpd_base_url, base_url)
elif mpd_base_url and not re.match(r'^https?://', base_url):
elif mpd_base_url and not re.match(r'https?://', base_url):
if not mpd_base_url.endswith('/'):
mpd_base_url += '/'
base_url = mpd_base_url + base_url
@@ -2894,7 +2906,7 @@ class InfoExtractor:
}
def location_key(location):
return 'url' if re.match(r'^https?://', location) else 'path'
return 'url' if re.match(r'https?://', location) else 'path'
if 'segment_urls' not in representation_ms_info and 'media' in representation_ms_info:
@@ -3150,7 +3162,7 @@ class InfoExtractor:
})
return formats, subtitles
def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8_native', mpd_id=None, preference=None, quality=None):
def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8_native', mpd_id=None, preference=None, quality=None, _headers=None):
def absolute_url(item_url):
return urljoin(base_url, item_url)
@@ -3174,11 +3186,11 @@ class InfoExtractor:
formats = self._extract_m3u8_formats(
full_url, video_id, ext='mp4',
entry_protocol=m3u8_entry_protocol, m3u8_id=m3u8_id,
preference=preference, quality=quality, fatal=False)
preference=preference, quality=quality, fatal=False, headers=_headers)
elif ext == 'mpd':
is_plain_url = False
formats = self._extract_mpd_formats(
full_url, video_id, mpd_id=mpd_id, fatal=False)
full_url, video_id, mpd_id=mpd_id, fatal=False, headers=_headers)
else:
is_plain_url = True
formats = [{
@@ -3272,6 +3284,8 @@ class InfoExtractor:
})
for f in media_info['formats']:
f.setdefault('http_headers', {})['Referer'] = base_url
if _headers:
f['http_headers'].update(_headers)
if media_info['formats'] or media_info['subtitles']:
entries.append(media_info)
return entries
@@ -3487,7 +3501,7 @@ class InfoExtractor:
continue
urls.add(source_url)
source_type = source.get('type') or ''
ext = mimetype2ext(source_type) or determine_ext(source_url)
ext = determine_ext(source_url, default_ext=mimetype2ext(source_type))
if source_type == 'hls' or ext == 'm3u8' or 'format=m3u8-aapl' in source_url:
formats.extend(self._extract_m3u8_formats(
source_url, video_id, 'mp4', entry_protocol='m3u8_native',

View File

@@ -1,6 +1,8 @@
from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
parse_codecs,
try_get,
url_or_none,
urlencode_postdata,
@@ -12,6 +14,7 @@ class DigitalConcertHallIE(InfoExtractor):
IE_DESC = 'DigitalConcertHall extractor'
_VALID_URL = r'https?://(?:www\.)?digitalconcerthall\.com/(?P<language>[a-z]+)/(?P<type>film|concert|work)/(?P<id>[0-9]+)-?(?P<part>[0-9]+)?'
_OAUTH_URL = 'https://api.digitalconcerthall.com/v2/oauth2/token'
_USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.5 Safari/605.1.15'
_ACCESS_TOKEN = None
_NETRC_MACHINE = 'digitalconcerthall'
_TESTS = [{
@@ -68,33 +71,42 @@ class DigitalConcertHallIE(InfoExtractor):
}]
def _perform_login(self, username, password):
token_response = self._download_json(
login_token = self._download_json(
self._OAUTH_URL,
None, 'Obtaining token', errnote='Unable to obtain token', data=urlencode_postdata({
'affiliate': 'none',
'grant_type': 'device',
'device_vendor': 'unknown',
# device_model 'Safari' gets split streams of 4K/HEVC video and lossless/FLAC audio
'device_model': 'unknown' if self._configuration_arg('prefer_combined_hls') else 'Safari',
'app_id': 'dch.webapp',
'app_version': '1.0.0',
'app_distributor': 'berlinphil',
'app_version': '1.84.0',
'client_secret': '2ySLN+2Fwb',
}), headers={
'Content-Type': 'application/x-www-form-urlencoded',
})
self._ACCESS_TOKEN = token_response['access_token']
'Accept': 'application/json',
'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8',
'User-Agent': self._USER_AGENT,
})['access_token']
try:
self._download_json(
login_response = self._download_json(
self._OAUTH_URL,
None, note='Logging in', errnote='Unable to login', data=urlencode_postdata({
'grant_type': 'password',
'username': username,
'password': password,
}), headers={
'Content-Type': 'application/x-www-form-urlencoded',
'Accept': 'application/json',
'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8',
'Referer': 'https://www.digitalconcerthall.com',
'Authorization': f'Bearer {self._ACCESS_TOKEN}',
'Authorization': f'Bearer {login_token}',
'User-Agent': self._USER_AGENT,
})
except ExtractorError:
self.raise_login_required(msg='Login info incorrect')
except ExtractorError as error:
if isinstance(error.cause, HTTPError) and error.cause.status == 401:
raise ExtractorError('Invalid username or password', expected=True)
raise
self._ACCESS_TOKEN = login_response['access_token']
def _real_initialize(self):
if not self._ACCESS_TOKEN:
@@ -108,11 +120,15 @@ class DigitalConcertHallIE(InfoExtractor):
'Accept': 'application/json',
'Authorization': f'Bearer {self._ACCESS_TOKEN}',
'Accept-Language': language,
'User-Agent': self._USER_AGENT,
})
formats = []
for m3u8_url in traverse_obj(stream_info, ('channel', ..., 'stream', ..., 'url', {url_or_none})):
formats.extend(self._extract_m3u8_formats(m3u8_url, video_id, 'mp4', fatal=False))
formats.extend(self._extract_m3u8_formats(m3u8_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
for fmt in formats:
if fmt.get('format_note') and fmt.get('vcodec') == 'none':
fmt.update(parse_codecs(fmt['format_note']))
yield {
'id': video_id,
@@ -140,13 +156,15 @@ class DigitalConcertHallIE(InfoExtractor):
f'https://api.digitalconcerthall.com/v2/{api_type}/{video_id}', video_id, headers={
'Accept': 'application/json',
'Accept-Language': language,
'User-Agent': self._USER_AGENT,
'Authorization': f'Bearer {self._ACCESS_TOKEN}',
})
album_artists = traverse_obj(vid_info, ('_links', 'artist', ..., 'name'))
videos = [vid_info] if type_ == 'film' else traverse_obj(vid_info, ('_embedded', ..., ...))
if type_ == 'work':
videos = [videos[int(part) - 1]]
album_artists = traverse_obj(vid_info, ('_links', 'artist', ..., 'name', {str}))
thumbnail = traverse_obj(vid_info, (
'image', ..., {self._proto_relative_url}, {url_or_none},
{lambda x: x.format(width=0, height=0)}, any)) # NB: 0x0 is the original size

View File

@@ -1,115 +0,0 @@
import random
import string
import urllib.parse
from .discoverygo import DiscoveryGoBaseIE
from ..networking.exceptions import HTTPError
from ..utils import ExtractorError
class DiscoveryIE(DiscoveryGoBaseIE):
_VALID_URL = r'''(?x)https?://
(?P<site>
go\.discovery|
www\.
(?:
investigationdiscovery|
discoverylife|
animalplanet|
ahctv|
destinationamerica|
sciencechannel|
tlc
)|
watch\.
(?:
hgtv|
foodnetwork|
travelchannel|
diynetwork|
cookingchanneltv|
motortrend
)
)\.com/tv-shows/(?P<show_slug>[^/]+)/(?:video|full-episode)s/(?P<id>[^./?#]+)'''
_TESTS = [{
'url': 'https://go.discovery.com/tv-shows/cash-cab/videos/riding-with-matthew-perry',
'info_dict': {
'id': '5a2f35ce6b66d17a5026e29e',
'ext': 'mp4',
'title': 'Riding with Matthew Perry',
'description': 'md5:a34333153e79bc4526019a5129e7f878',
'duration': 84,
},
'params': {
'skip_download': True, # requires ffmpeg
},
}, {
'url': 'https://www.investigationdiscovery.com/tv-shows/final-vision/full-episodes/final-vision',
'only_matching': True,
}, {
'url': 'https://go.discovery.com/tv-shows/alaskan-bush-people/videos/follow-your-own-road',
'only_matching': True,
}, {
# using `show_slug` is important to get the correct video data
'url': 'https://www.sciencechannel.com/tv-shows/mythbusters-on-science/full-episodes/christmas-special',
'only_matching': True,
}]
_GEO_COUNTRIES = ['US']
_GEO_BYPASS = False
_API_BASE_URL = 'https://api.discovery.com/v1/'
def _real_extract(self, url):
site, show_slug, display_id = self._match_valid_url(url).groups()
access_token = None
cookies = self._get_cookies(url)
# prefer Affiliate Auth Token over Anonymous Auth Token
auth_storage_cookie = cookies.get('eosAf') or cookies.get('eosAn')
if auth_storage_cookie and auth_storage_cookie.value:
auth_storage = self._parse_json(urllib.parse.unquote(
urllib.parse.unquote(auth_storage_cookie.value)),
display_id, fatal=False) or {}
access_token = auth_storage.get('a') or auth_storage.get('access_token')
if not access_token:
access_token = self._download_json(
f'https://{site}.com/anonymous', display_id,
'Downloading token JSON metadata', query={
'authRel': 'authorization',
'client_id': '3020a40c2356a645b4b4',
'nonce': ''.join(random.choices(string.ascii_letters, k=32)),
'redirectUri': 'https://www.discovery.com/',
})['access_token']
headers = self.geo_verification_headers()
headers['Authorization'] = 'Bearer ' + access_token
try:
video = self._download_json(
self._API_BASE_URL + 'content/videos',
display_id, 'Downloading content JSON metadata',
headers=headers, query={
'embed': 'show.name',
'fields': 'authenticated,description.detailed,duration,episodeNumber,id,name,parental.rating,season.number,show,tags',
'slug': display_id,
'show_slug': show_slug,
})[0]
video_id = video['id']
stream = self._download_json(
self._API_BASE_URL + 'streaming/video/' + video_id,
display_id, 'Downloading streaming JSON metadata', headers=headers)
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status in (401, 403):
e_description = self._parse_json(
e.cause.response.read().decode(), display_id)['description']
if 'resource not available for country' in e_description:
self.raise_geo_restricted(countries=self._GEO_COUNTRIES)
if 'Authorized Networks' in e_description:
raise ExtractorError(
'This video is only available via cable service provider subscription that'
' is not currently supported. You may want to use --cookies.', expected=True)
raise ExtractorError(e_description)
raise
return self._extract_video_info(video, stream, display_id)

View File

@@ -1,171 +0,0 @@
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
determine_ext,
extract_attributes,
int_or_none,
parse_age_limit,
remove_end,
unescapeHTML,
url_or_none,
)
class DiscoveryGoBaseIE(InfoExtractor):
_VALID_URL_TEMPLATE = r'''(?x)https?://(?:www\.)?(?:
discovery|
investigationdiscovery|
discoverylife|
animalplanet|
ahctv|
destinationamerica|
sciencechannel|
tlc|
velocitychannel
)go\.com/%s(?P<id>[^/?#&]+)'''
def _extract_video_info(self, video, stream, display_id):
title = video['name']
if not stream:
if video.get('authenticated') is True:
raise ExtractorError(
'This video is only available via cable service provider subscription that'
' is not currently supported. You may want to use --cookies.', expected=True)
else:
raise ExtractorError('Unable to find stream')
STREAM_URL_SUFFIX = 'streamUrl'
formats = []
for stream_kind in ('', 'hds'):
suffix = STREAM_URL_SUFFIX.capitalize() if stream_kind else STREAM_URL_SUFFIX
stream_url = stream.get(f'{stream_kind}{suffix}')
if not stream_url:
continue
if stream_kind == '':
formats.extend(self._extract_m3u8_formats(
stream_url, display_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
elif stream_kind == 'hds':
formats.extend(self._extract_f4m_formats(
stream_url, display_id, f4m_id=stream_kind, fatal=False))
video_id = video.get('id') or display_id
description = video.get('description', {}).get('detailed')
duration = int_or_none(video.get('duration'))
series = video.get('show', {}).get('name')
season_number = int_or_none(video.get('season', {}).get('number'))
episode_number = int_or_none(video.get('episodeNumber'))
tags = video.get('tags')
age_limit = parse_age_limit(video.get('parental', {}).get('rating'))
subtitles = {}
captions = stream.get('captions')
if isinstance(captions, list):
for caption in captions:
subtitle_url = url_or_none(caption.get('fileUrl'))
if not subtitle_url or not subtitle_url.startswith('http'):
continue
lang = caption.get('fileLang', 'en')
ext = determine_ext(subtitle_url)
subtitles.setdefault(lang, []).append({
'url': subtitle_url,
'ext': 'ttml' if ext == 'xml' else ext,
})
return {
'id': video_id,
'display_id': display_id,
'title': title,
'description': description,
'duration': duration,
'series': series,
'season_number': season_number,
'episode_number': episode_number,
'tags': tags,
'age_limit': age_limit,
'formats': formats,
'subtitles': subtitles,
}
class DiscoveryGoIE(DiscoveryGoBaseIE):
_VALID_URL = DiscoveryGoBaseIE._VALID_URL_TEMPLATE % r'(?:[^/]+/)+'
_GEO_COUNTRIES = ['US']
_TEST = {
'url': 'https://www.discoverygo.com/bering-sea-gold/reaper-madness/',
'info_dict': {
'id': '58c167d86b66d12f2addeb01',
'ext': 'mp4',
'title': 'Reaper Madness',
'description': 'md5:09f2c625c99afb8946ed4fb7865f6e78',
'duration': 2519,
'series': 'Bering Sea Gold',
'season_number': 8,
'episode_number': 6,
'age_limit': 14,
},
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
container = extract_attributes(
self._search_regex(
r'(<div[^>]+class=["\']video-player-container[^>]+>)',
webpage, 'video container'))
video = self._parse_json(
container.get('data-video') or container.get('data-json'),
display_id)
stream = video.get('stream')
return self._extract_video_info(video, stream, display_id)
class DiscoveryGoPlaylistIE(DiscoveryGoBaseIE):
_VALID_URL = DiscoveryGoBaseIE._VALID_URL_TEMPLATE % ''
_TEST = {
'url': 'https://www.discoverygo.com/bering-sea-gold/',
'info_dict': {
'id': 'bering-sea-gold',
'title': 'Bering Sea Gold',
'description': 'md5:cc5c6489835949043c0cc3ad66c2fa0e',
},
'playlist_mincount': 6,
}
@classmethod
def suitable(cls, url):
return False if DiscoveryGoIE.suitable(url) else super().suitable(url)
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
entries = []
for mobj in re.finditer(r'data-json=(["\'])(?P<json>{.+?})\1', webpage):
data = self._parse_json(
mobj.group('json'), display_id,
transform_source=unescapeHTML, fatal=False)
if not isinstance(data, dict) or data.get('type') != 'episode':
continue
episode_url = data.get('socialUrl')
if not episode_url:
continue
entries.append(self.url_result(
episode_url, ie=DiscoveryGoIE.ie_key(),
video_id=data.get('id')))
return self.playlist_result(
entries, display_id,
remove_end(self._og_search_title(
webpage, fatal=False), ' | Discovery GO'),
self._og_search_description(webpage))

View File

@@ -24,8 +24,9 @@ from ..utils import (
class DouyuBaseIE(InfoExtractor):
def _download_cryptojs_md5(self, video_id):
for url in [
# XXX: Do NOT use cdn.bootcdn.net; ref: https://sansec.io/research/polyfill-supply-chain-attack
'https://cdnjs.cloudflare.com/ajax/libs/crypto-js/3.1.2/rollups/md5.js',
'https://cdn.bootcdn.net/ajax/libs/crypto-js/3.1.2/rollups/md5.js',
'https://unpkg.com/cryptojslib@3.1.2/rollups/md5.js',
]:
js_code = self._download_webpage(
url, video_id, note='Downloading signing dependency', fatal=False)
@@ -35,7 +36,8 @@ class DouyuBaseIE(InfoExtractor):
raise ExtractorError('Unable to download JS dependency (crypto-js/md5)')
def _get_cryptojs_md5(self, video_id):
return self.cache.load('douyu', 'crypto-js-md5') or self._download_cryptojs_md5(video_id)
return self.cache.load(
'douyu', 'crypto-js-md5', min_ver='2024.07.04') or self._download_cryptojs_md5(video_id)
def _calc_sign(self, sign_func, video_id, a):
b = uuid.uuid4().hex

View File

@@ -319,35 +319,17 @@ class DPlayIE(DPlayBaseIE):
url, display_id, host, 'dplay' + country, country, domain)
class HGTVDeIE(DPlayBaseIE):
_VALID_URL = r'https?://de\.hgtv\.com/sendungen' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://de.hgtv.com/sendungen/tiny-house-klein-aber-oho/wer-braucht-schon-eine-toilette/',
'info_dict': {
'id': '151205',
'display_id': 'tiny-house-klein-aber-oho/wer-braucht-schon-eine-toilette',
'ext': 'mp4',
'title': 'Wer braucht schon eine Toilette',
'description': 'md5:05b40a27e7aed2c9172de34d459134e2',
'duration': 1177.024,
'timestamp': 1595705400,
'upload_date': '20200725',
'creator': 'HGTV',
'series': 'Tiny House - klein, aber oho',
'season_number': 3,
'episode_number': 3,
},
}]
def _real_extract(self, url):
display_id = self._match_id(url)
return self._get_disco_api_info(
url, display_id, 'eu1-prod.disco-api.com', 'hgtv', 'de')
class DiscoveryPlusBaseIE(DPlayBaseIE):
"""Subclasses must set _PRODUCT, _DISCO_API_PARAMS"""
_DISCO_CLIENT_VER = '27.43.0'
def _update_disco_api_headers(self, headers, disco_base, display_id, realm):
headers['x-disco-client'] = f'WEB:UNKNOWN:{self._PRODUCT}:25.2.6'
headers.update({
'x-disco-params': f'realm={realm},siteLookupKey={self._PRODUCT}',
'x-disco-client': f'WEB:UNKNOWN:{self._PRODUCT}:{self._DISCO_CLIENT_VER}',
'Authorization': self._get_auth(disco_base, display_id, realm),
})
def _download_video_playback_info(self, disco_base, video_id, headers):
return self._download_json(
@@ -365,9 +347,68 @@ class DiscoveryPlusBaseIE(DPlayBaseIE):
return self._get_disco_api_info(url, self._match_id(url), **self._DISCO_API_PARAMS)
class HGTVDeIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://de\.hgtv\.com/sendungen' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://de.hgtv.com/sendungen/mein-kleinstadt-traumhaus/vom-landleben-ins-loft',
'info_dict': {
'id': '7332936',
'ext': 'mp4',
'display_id': 'mein-kleinstadt-traumhaus/vom-landleben-ins-loft',
'title': 'Vom Landleben ins Loft',
'description': 'md5:e5f72c02c853970796dd3818f2e25745',
'episode': 'Episode 7',
'episode_number': 7,
'season': 'Season 7',
'season_number': 7,
'series': 'Mein Kleinstadt-Traumhaus',
'duration': 2645.0,
'timestamp': 1725998100,
'upload_date': '20240910',
'creators': ['HGTV'],
'tags': [],
'thumbnail': 'https://eu1-prod-images.disco-api.com/2024/08/09/82a386b9-c688-32c7-b9ff-0b13865f0bae.jpeg',
},
}]
_PRODUCT = 'hgtv'
_DISCO_API_PARAMS = {
'disco_host': 'eu1-prod.disco-api.com',
'realm': 'hgtv',
'country': 'de',
}
def _update_disco_api_headers(self, headers, disco_base, display_id, realm):
headers.update({
'x-disco-params': f'realm={realm}',
'x-disco-client': 'Alps:HyogaPlayer:0.0.0',
'Authorization': self._get_auth(disco_base, display_id, realm),
})
class GoDiscoveryIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:go\.)?discovery\.com/video' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://go.discovery.com/video/in-the-eye-of-the-storm-discovery-atve-us/trapped-in-a-twister',
'info_dict': {
'id': '5352642',
'display_id': 'in-the-eye-of-the-storm-discovery-atve-us/trapped-in-a-twister',
'ext': 'mp4',
'title': 'Trapped in a Twister',
'description': 'Twisters destroy Midwest towns, trapping spotters in the eye of the storm.',
'episode_number': 1,
'episode': 'Episode 1',
'season_number': 1,
'season': 'Season 1',
'series': 'In The Eye Of The Storm',
'duration': 2490.237,
'upload_date': '20240715',
'timestamp': 1721008800,
'tags': [],
'creators': ['Discovery'],
'thumbnail': 'https://us1-prod-images.disco-api.com/2024/07/10/5e39637d-cabf-3ab3-8e9a-f4e9d37bc036.jpeg',
},
}, {
'url': 'https://go.discovery.com/video/dirty-jobs-discovery-atve-us/rodbuster-galvanizer',
'info_dict': {
'id': '4164906',
@@ -395,6 +436,26 @@ class GoDiscoveryIE(DiscoveryPlusBaseIE):
class TravelChannelIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:watch\.)?travelchannel\.com/video' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://watch.travelchannel.com/video/the-dead-files-travel-channel/protect-the-children',
'info_dict': {
'id': '4710177',
'display_id': 'the-dead-files-travel-channel/protect-the-children',
'ext': 'mp4',
'title': 'Protect the Children',
'description': 'An evil presence threatens an Ohio woman\'s children and marriage.',
'season_number': 14,
'season': 'Season 14',
'episode_number': 10,
'episode': 'Episode 10',
'series': 'The Dead Files',
'duration': 2550.481,
'timestamp': 1664510400,
'upload_date': '20220930',
'tags': [],
'creators': ['Travel Channel'],
'thumbnail': 'https://us1-prod-images.disco-api.com/2022/03/17/5e45eace-de5d-343a-9293-f400a2aa77d5.jpeg',
},
}, {
'url': 'https://watch.travelchannel.com/video/ghost-adventures-travel-channel/ghost-train-of-ely',
'info_dict': {
'id': '2220256',
@@ -422,6 +483,26 @@ class TravelChannelIE(DiscoveryPlusBaseIE):
class CookingChannelIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:watch\.)?cookingchanneltv\.com/video' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://watch.cookingchanneltv.com/video/bobbys-triple-threat-food-network-atve-us/titans-vs-marcus-samuelsson',
'info_dict': {
'id': '5350005',
'ext': 'mp4',
'display_id': 'bobbys-triple-threat-food-network-atve-us/titans-vs-marcus-samuelsson',
'title': 'Titans vs Marcus Samuelsson',
'description': 'Marcus Samuelsson throws his legendary global tricks at the Titans.',
'episode_number': 1,
'episode': 'Episode 1',
'season_number': 3,
'season': 'Season 3',
'series': 'Bobby\'s Triple Threat',
'duration': 2520.851,
'upload_date': '20240710',
'timestamp': 1720573200,
'tags': [],
'creators': ['Food Network'],
'thumbnail': 'https://us1-prod-images.disco-api.com/2024/07/04/529cd095-27ec-35c5-84e9-90ebd3e5d2da.jpeg',
},
}, {
'url': 'https://watch.cookingchanneltv.com/video/carnival-eats-cooking-channel/the-postman-always-brings-rice-2348634',
'info_dict': {
'id': '2348634',
@@ -449,6 +530,22 @@ class CookingChannelIE(DiscoveryPlusBaseIE):
class HGTVUsaIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:watch\.)?hgtv\.com/video' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://watch.hgtv.com/video/flip-or-flop-the-final-flip-hgtv-atve-us/flip-or-flop-the-final-flip',
'info_dict': {
'id': '5025585',
'display_id': 'flip-or-flop-the-final-flip-hgtv-atve-us/flip-or-flop-the-final-flip',
'ext': 'mp4',
'title': 'Flip or Flop: The Final Flip',
'description': 'Tarek and Christina are going their separate ways after one last flip!',
'series': 'Flip or Flop: The Final Flip',
'duration': 2580.644,
'upload_date': '20231101',
'timestamp': 1698811200,
'tags': [],
'creators': ['HGTV'],
'thumbnail': 'https://us1-prod-images.disco-api.com/2022/11/27/455caa6c-1462-3f14-b63d-a026d7a5e6d3.jpeg',
},
}, {
'url': 'https://watch.hgtv.com/video/home-inspector-joe-hgtv-atve-us/this-mold-house',
'info_dict': {
'id': '4289736',
@@ -476,6 +573,26 @@ class HGTVUsaIE(DiscoveryPlusBaseIE):
class FoodNetworkIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:watch\.)?foodnetwork\.com/video' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://watch.foodnetwork.com/video/guys-grocery-games-food-network/wild-in-the-aisles',
'info_dict': {
'id': '2152549',
'display_id': 'guys-grocery-games-food-network/wild-in-the-aisles',
'ext': 'mp4',
'title': 'Wild in the Aisles',
'description': 'The chefs make spaghetti and meatballs with "Out of Stock" ingredients.',
'season_number': 1,
'season': 'Season 1',
'episode_number': 1,
'episode': 'Episode 1',
'series': 'Guy\'s Grocery Games',
'tags': [],
'creators': ['Food Network'],
'duration': 2520.651,
'upload_date': '20230623',
'timestamp': 1687492800,
'thumbnail': 'https://us1-prod-images.disco-api.com/2022/06/15/37fb5333-cad2-3dbb-af7c-c20ec77c89c6.jpeg',
},
}, {
'url': 'https://watch.foodnetwork.com/video/kids-baking-championship-food-network/float-like-a-butterfly',
'info_dict': {
'id': '4116449',
@@ -503,6 +620,26 @@ class FoodNetworkIE(DiscoveryPlusBaseIE):
class DestinationAmericaIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:www\.)?destinationamerica\.com/video' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://www.destinationamerica.com/video/bbq-pit-wars-destination-america/smoke-on-the-water',
'info_dict': {
'id': '2218409',
'display_id': 'bbq-pit-wars-destination-america/smoke-on-the-water',
'ext': 'mp4',
'title': 'Smoke on the Water',
'description': 'The pitmasters head to Georgia for the Smoke on the Water BBQ Festival.',
'season_number': 2,
'season': 'Season 2',
'episode_number': 1,
'episode': 'Episode 1',
'series': 'BBQ Pit Wars',
'tags': [],
'creators': ['Destination America'],
'duration': 2614.878,
'upload_date': '20230623',
'timestamp': 1687492800,
'thumbnail': 'https://us1-prod-images.disco-api.com/2020/05/11/c0f8e85d-9a10-3e6f-8e43-f6faafa81ba2.jpeg',
},
}, {
'url': 'https://www.destinationamerica.com/video/alaska-monsters-destination-america-atve-us/central-alaskas-bigfoot',
'info_dict': {
'id': '4210904',
@@ -530,6 +667,26 @@ class DestinationAmericaIE(DiscoveryPlusBaseIE):
class InvestigationDiscoveryIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:www\.)?investigationdiscovery\.com/video' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://www.investigationdiscovery.com/video/deadly-influence-the-social-media-murders-investigation-discovery-atve-us/rip-bianca',
'info_dict': {
'id': '5341132',
'display_id': 'deadly-influence-the-social-media-murders-investigation-discovery-atve-us/rip-bianca',
'ext': 'mp4',
'title': 'RIP Bianca',
'description': 'A teenage influencer discovers an online world of threat, harm and danger.',
'season_number': 1,
'season': 'Season 1',
'episode_number': 3,
'episode': 'Episode 3',
'series': 'Deadly Influence: The Social Media Murders',
'creators': ['Investigation Discovery'],
'tags': [],
'duration': 2490.888,
'upload_date': '20240618',
'timestamp': 1718672400,
'thumbnail': 'https://us1-prod-images.disco-api.com/2024/06/15/b567c774-9e44-3c6c-b0ba-db860a73e812.jpeg',
},
}, {
'url': 'https://www.investigationdiscovery.com/video/unmasked-investigation-discovery/the-killer-clown',
'info_dict': {
'id': '2139409',
@@ -557,6 +714,26 @@ class InvestigationDiscoveryIE(DiscoveryPlusBaseIE):
class AmHistoryChannelIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:www\.)?ahctv\.com/video' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://www.ahctv.com/video/blood-and-fury-americas-civil-war-ahc/battle-of-bull-run',
'info_dict': {
'id': '2139199',
'display_id': 'blood-and-fury-americas-civil-war-ahc/battle-of-bull-run',
'ext': 'mp4',
'title': 'Battle of Bull Run',
'description': 'Two untested armies clash in the first real battle of the Civil War.',
'season_number': 1,
'season': 'Season 1',
'episode_number': 1,
'episode': 'Episode 1',
'series': 'Blood and Fury: America\'s Civil War',
'duration': 2612.509,
'upload_date': '20220923',
'timestamp': 1663905600,
'creators': ['AHC'],
'tags': [],
'thumbnail': 'https://us1-prod-images.disco-api.com/2020/05/11/4af61bd7-d705-3108-82c4-1a6e541e20fa.jpeg',
},
}, {
'url': 'https://www.ahctv.com/video/modern-sniper-ahc/army',
'info_dict': {
'id': '2309730',
@@ -584,6 +761,26 @@ class AmHistoryChannelIE(DiscoveryPlusBaseIE):
class ScienceChannelIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:www\.)?sciencechannel\.com/video' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://www.sciencechannel.com/video/spaces-deepest-secrets-science-atve-us/mystery-of-the-dead-planets',
'info_dict': {
'id': '2347335',
'display_id': 'spaces-deepest-secrets-science-atve-us/mystery-of-the-dead-planets',
'ext': 'mp4',
'title': 'Mystery of the Dead Planets',
'description': 'Astronomers unmask the truly destructive nature of the cosmos.',
'season_number': 7,
'season': 'Season 7',
'episode_number': 1,
'episode': 'Episode 1',
'series': 'Space\'s Deepest Secrets',
'duration': 2524.989,
'upload_date': '20230128',
'timestamp': 1674882000,
'creators': ['Science'],
'tags': [],
'thumbnail': 'https://us1-prod-images.disco-api.com/2021/03/30/3796829d-aead-3f9a-bd8d-e49048b3cdca.jpeg',
},
}, {
'url': 'https://www.sciencechannel.com/video/strangest-things-science-atve-us/nazi-mystery-machine',
'info_dict': {
'id': '2842849',
@@ -608,36 +805,29 @@ class ScienceChannelIE(DiscoveryPlusBaseIE):
}
class DIYNetworkIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:watch\.)?diynetwork\.com/video' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://watch.diynetwork.com/video/pool-kings-diy-network/bringing-beach-life-to-texas',
'info_dict': {
'id': '2309730',
'display_id': 'pool-kings-diy-network/bringing-beach-life-to-texas',
'ext': 'mp4',
'title': 'Bringing Beach Life to Texas',
'description': 'The Pool Kings give a family a day at the beach in their own backyard.',
'season_number': 10,
'episode_number': 2,
},
'skip': 'Available for Premium users',
}, {
'url': 'https://watch.diynetwork.com/video/pool-kings-diy-network/bringing-beach-life-to-texas',
'only_matching': True,
}]
_PRODUCT = 'diy'
_DISCO_API_PARAMS = {
'disco_host': 'us1-prod-direct.watch.diynetwork.com',
'realm': 'go',
'country': 'us',
}
class DiscoveryLifeIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:www\.)?discoverylife\.com/video' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://www.discoverylife.com/video/er-files-discovery-life-atve-us/sweet-charity',
'info_dict': {
'id': '2347614',
'display_id': 'er-files-discovery-life-atve-us/sweet-charity',
'ext': 'mp4',
'title': 'Sweet Charity',
'description': 'The staff at Charity Hospital treat a serious foot infection.',
'season_number': 1,
'season': 'Season 1',
'episode_number': 1,
'episode': 'Episode 1',
'series': 'ER Files',
'duration': 2364.261,
'upload_date': '20230721',
'timestamp': 1689912000,
'creators': ['Discovery Life'],
'tags': [],
'thumbnail': 'https://us1-prod-images.disco-api.com/2021/03/16/4b6f0124-360b-3546-b6a4-5552db886b86.jpeg',
},
}, {
'url': 'https://www.discoverylife.com/video/surviving-death-discovery-life-atve-us/bodily-trauma',
'info_dict': {
'id': '2218238',
@@ -665,6 +855,26 @@ class DiscoveryLifeIE(DiscoveryPlusBaseIE):
class AnimalPlanetIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:www\.)?animalplanet\.com/video' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://www.animalplanet.com/video/mysterious-creatures-with-forrest-galante-animal-planet-atve-us/the-demon-of-peru',
'info_dict': {
'id': '4650835',
'display_id': 'mysterious-creatures-with-forrest-galante-animal-planet-atve-us/the-demon-of-peru',
'ext': 'mp4',
'title': 'The Demon of Peru',
'description': 'In Peru, a farming village is being terrorized by a “man-like beast.”',
'season_number': 1,
'season': 'Season 1',
'episode_number': 4,
'episode': 'Episode 4',
'series': 'Mysterious Creatures with Forrest Galante',
'duration': 2490.488,
'upload_date': '20230111',
'timestamp': 1673413200,
'creators': ['Animal Planet'],
'tags': [],
'thumbnail': 'https://us1-prod-images.disco-api.com/2022/03/01/6dbaa833-9a2e-3fee-9381-c19eddf67c0c.jpeg',
},
}, {
'url': 'https://www.animalplanet.com/video/north-woods-law-animal-planet/squirrel-showdown',
'info_dict': {
'id': '3338923',
@@ -692,6 +902,26 @@ class AnimalPlanetIE(DiscoveryPlusBaseIE):
class TLCIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:go\.)?tlc\.com/video' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://go.tlc.com/video/90-day-the-last-resort-tlc-atve-us/the-last-chance',
'info_dict': {
'id': '5186422',
'display_id': '90-day-the-last-resort-tlc-atve-us/the-last-chance',
'ext': 'mp4',
'title': 'The Last Chance',
'description': 'Infidelity shakes Kalani and Asuelu\'s world, and Angela threatens divorce.',
'season_number': 1,
'season': 'Season 1',
'episode_number': 1,
'episode': 'Episode 1',
'series': '90 Day: The Last Resort',
'duration': 5123.91,
'upload_date': '20230815',
'timestamp': 1692061200,
'creators': ['TLC'],
'tags': [],
'thumbnail': 'https://us1-prod-images.disco-api.com/2023/08/08/0ee367e2-ac76-334d-bf23-dbf796696a24.jpeg',
},
}, {
'url': 'https://go.tlc.com/video/my-600-lb-life-tlc/melissas-story-part-1',
'info_dict': {
'id': '2206540',
@@ -716,93 +946,8 @@ class TLCIE(DiscoveryPlusBaseIE):
}
class MotorTrendIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:watch\.)?motortrend\.com/video' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://watch.motortrend.com/video/car-issues-motortrend-atve-us/double-dakotas',
'info_dict': {
'id': '"4859182"',
'display_id': 'double-dakotas',
'ext': 'mp4',
'title': 'Double Dakotas',
'description': 'Tylers buy-one-get-one Dakota deal has the Wizard pulling double duty.',
'season_number': 2,
'episode_number': 3,
},
'skip': 'Available for Premium users',
}, {
'url': 'https://watch.motortrend.com/video/car-issues-motortrend-atve-us/double-dakotas',
'only_matching': True,
}]
_PRODUCT = 'vel'
_DISCO_API_PARAMS = {
'disco_host': 'us1-prod-direct.watch.motortrend.com',
'realm': 'go',
'country': 'us',
}
class MotorTrendOnDemandIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:www\.)?motortrend(?:ondemand\.com|\.com/plus)/detail' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://www.motortrendondemand.com/detail/wheelstanding-dump-truck-stubby-bobs-comeback/37699/784',
'info_dict': {
'id': '37699',
'display_id': 'wheelstanding-dump-truck-stubby-bobs-comeback/37699',
'ext': 'mp4',
'title': 'Wheelstanding Dump Truck! Stubby Bobs Comeback',
'description': 'md5:996915abe52a1c3dfc83aecea3cce8e7',
'season_number': 5,
'episode_number': 52,
'episode': 'Episode 52',
'season': 'Season 5',
'thumbnail': r're:^https?://.+\.jpe?g$',
'timestamp': 1388534401,
'duration': 1887.345,
'creator': 'Originals',
'series': 'Roadkill',
'upload_date': '20140101',
'tags': [],
},
}, {
'url': 'https://www.motortrend.com/plus/detail/roadworthy-rescues-teaser-trailer/4922860/',
'info_dict': {
'id': '4922860',
'ext': 'mp4',
'title': 'Roadworthy Rescues | Teaser Trailer',
'description': 'Derek Bieri helps Freiburger and Finnegan with their \'68 big-block Dart.',
'display_id': 'roadworthy-rescues-teaser-trailer/4922860',
'creator': 'Originals',
'series': 'Roadworthy Rescues',
'thumbnail': r're:^https?://.+\.jpe?g$',
'upload_date': '20220907',
'timestamp': 1662523200,
'duration': 1066.356,
'tags': [],
},
}, {
'url': 'https://www.motortrend.com/plus/detail/ugly-duckling/2450033/12439',
'only_matching': True,
}]
_PRODUCT = 'MTOD'
_DISCO_API_PARAMS = {
'disco_host': 'us1-prod-direct.motortrendondemand.com',
'realm': 'motortrend',
'country': 'us',
}
def _update_disco_api_headers(self, headers, disco_base, display_id, realm):
headers.update({
'x-disco-params': f'realm={realm}',
'x-disco-client': f'WEB:UNKNOWN:{self._PRODUCT}:4.39.1-gi1',
'Authorization': self._get_auth(disco_base, display_id, realm),
})
class DiscoveryPlusIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:www\.)?discoveryplus\.com/(?!it/)(?:\w{2}/)?video' + DPlayBaseIE._PATH_REGEX
_VALID_URL = r'https?://(?:www\.)?discoveryplus\.com/(?!it/)(?:(?P<country>[a-z]{2})/)?video(?:/sport|/olympics)?' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://www.discoveryplus.com/video/property-brothers-forever-home/food-and-family',
'info_dict': {
@@ -823,14 +968,45 @@ class DiscoveryPlusIE(DiscoveryPlusBaseIE):
}, {
'url': 'https://discoveryplus.com/ca/video/bering-sea-gold-discovery-ca/goldslingers',
'only_matching': True,
}, {
'url': 'https://www.discoveryplus.com/gb/video/sport/eurosport-1-british-eurosport-1-british-sport/6-hours-of-spa-review',
'only_matching': True,
}, {
'url': 'https://www.discoveryplus.com/gb/video/olympics/dplus-sport-dplus-sport-sport/rugby-sevens-australia-samoa',
'only_matching': True,
}]
_PRODUCT = 'dplus_us'
_DISCO_API_PARAMS = {
'disco_host': 'us1-prod-direct.discoveryplus.com',
'realm': 'go',
'country': 'us',
}
_PRODUCT = None
_DISCO_API_PARAMS = None
def _update_disco_api_headers(self, headers, disco_base, display_id, realm):
headers.update({
'x-disco-params': f'realm={realm},siteLookupKey={self._PRODUCT}',
'x-disco-client': f'WEB:UNKNOWN:dplus_us:{self._DISCO_CLIENT_VER}',
'Authorization': self._get_auth(disco_base, display_id, realm),
})
def _real_extract(self, url):
video_id, country = self._match_valid_url(url).group('id', 'country')
if not country:
country = 'us'
self._PRODUCT = f'dplus_{country}'
if country in ('br', 'ca', 'us'):
self._DISCO_API_PARAMS = {
'disco_host': 'us1-prod-direct.discoveryplus.com',
'realm': 'go',
'country': country,
}
else:
self._DISCO_API_PARAMS = {
'disco_host': 'eu1-prod-direct.discoveryplus.com',
'realm': 'dplay',
'country': country,
}
return self._get_disco_api_info(url, video_id, **self._DISCO_API_PARAMS)
class DiscoveryPlusIndiaIE(DiscoveryPlusBaseIE):
@@ -984,16 +1160,22 @@ class DiscoveryPlusShowBaseIE(DPlayBaseIE):
class DiscoveryPlusItalyIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:www\.)?discoveryplus\.com/it/video' + DPlayBaseIE._PATH_REGEX
_VALID_URL = r'https?://(?:www\.)?discoveryplus\.com/it/video(?:/sport|/olympics)?' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://www.discoveryplus.com/it/video/i-signori-della-neve/stagione-2-episodio-1-i-preparativi',
'only_matching': True,
}, {
'url': 'https://www.discoveryplus.com/it/video/super-benny/trailer',
'only_matching': True,
}, {
'url': 'https://www.discoveryplus.com/it/video/olympics/dplus-sport-dplus-sport-sport/water-polo-greece-italy',
'only_matching': True,
}, {
'url': 'https://www.discoveryplus.com/it/video/sport/dplus-sport-dplus-sport-sport/lisa-vittozzi-allinferno-e-ritorno',
'only_matching': True,
}]
_PRODUCT = 'dplus_us'
_PRODUCT = 'dplus_it'
_DISCO_API_PARAMS = {
'disco_host': 'eu1-prod-direct.discoveryplus.com',
'realm': 'dplay',
@@ -1002,8 +1184,8 @@ class DiscoveryPlusItalyIE(DiscoveryPlusBaseIE):
def _update_disco_api_headers(self, headers, disco_base, display_id, realm):
headers.update({
'x-disco-params': f'realm={realm}',
'x-disco-client': f'WEB:UNKNOWN:{self._PRODUCT}:25.2.6',
'x-disco-params': f'realm={realm},siteLookupKey={self._PRODUCT}',
'x-disco-client': f'WEB:UNKNOWN:dplus_us:{self._DISCO_CLIENT_VER}',
'Authorization': self._get_auth(disco_base, display_id, realm),
})
@@ -1044,39 +1226,3 @@ class DiscoveryPlusIndiaShowIE(DiscoveryPlusShowBaseIE):
_SHOW_STR = 'show'
_INDEX = 4
_VIDEO_IE = DiscoveryPlusIndiaIE
class GlobalCyclingNetworkPlusIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://plus\.globalcyclingnetwork\.com/watch/(?P<id>\d+)'
_TESTS = [{
'url': 'https://plus.globalcyclingnetwork.com/watch/1397691',
'info_dict': {
'id': '1397691',
'ext': 'mp4',
'title': 'The Athertons: Mountain Biking\'s Fastest Family',
'description': 'md5:75a81937fcd8b989eec6083a709cd837',
'thumbnail': 'https://us1-prod-images.disco-api.com/2021/03/04/eb9e3026-4849-3001-8281-9356466f0557.png',
'series': 'gcn',
'creator': 'Gcn',
'upload_date': '20210309',
'timestamp': 1615248000,
'duration': 2531.0,
'tags': [],
},
'skip': 'Subscription required',
'params': {'skip_download': 'm3u8'},
}]
_PRODUCT = 'web'
_DISCO_API_PARAMS = {
'disco_host': 'disco-api-prod.globalcyclingnetwork.com',
'realm': 'gcn',
'country': 'us',
}
def _update_disco_api_headers(self, headers, disco_base, display_id, realm):
headers.update({
'x-disco-params': f'realm={realm}',
'x-disco-client': f'WEB:UNKNOWN:{self._PRODUCT}:27.3.2',
'Authorization': self._get_auth(disco_base, display_id, realm),
})

View File

@@ -6,8 +6,10 @@ import urllib.parse
from .common import InfoExtractor
from ..utils import (
ExtractorError,
update_url,
update_url_query,
url_basename,
urlencode_postdata,
)
@@ -36,43 +38,58 @@ class DropboxIE(InfoExtractor):
},
]
def _yield_decoded_parts(self, webpage):
for encoded in reversed(re.findall(r'registerStreamedPrefetch\s*\(\s*"[\w/+=]+"\s*,\s*"([\w/+=]+)"', webpage)):
yield base64.b64decode(encoded).decode('utf-8', 'ignore')
def _real_extract(self, url):
mobj = self._match_valid_url(url)
video_id = mobj.group('id')
webpage = self._download_webpage(url, video_id)
fn = urllib.parse.unquote(url_basename(url))
title = os.path.splitext(fn)[0]
password = self.get_param('videopassword')
if (self._og_search_title(webpage) == 'Dropbox - Password Required'
or 'Enter the password for this link' in webpage):
for part in self._yield_decoded_parts(webpage):
if '/sm/password' in part:
webpage = self._download_webpage(
update_url('https://www.dropbox.com/sm/password', query=part.partition('?')[2]), video_id)
break
if (self._og_search_title(webpage, default=None) == 'Dropbox - Password Required'
or 'Enter the password for this link' in webpage):
if password:
content_id = self._search_regex(r'content_id=(.*?)["\']', webpage, 'content_id')
payload = f'is_xhr=true&t={self._get_cookies("https://www.dropbox.com").get("t").value}&content_id={content_id}&password={password}&url={url}'
response = self._download_json(
'https://www.dropbox.com/sm/auth', video_id, 'POSTing video password', data=payload.encode(),
headers={'content-type': 'application/x-www-form-urlencoded; charset=UTF-8'})
'https://www.dropbox.com/sm/auth', video_id, 'POSTing video password',
headers={'content-type': 'application/x-www-form-urlencoded; charset=UTF-8'},
data=urlencode_postdata({
'is_xhr': 'true',
't': self._get_cookies('https://www.dropbox.com')['t'].value,
'content_id': self._search_regex(r'content_id=([\w.+=/-]+)["\']', webpage, 'content id'),
'password': password,
'url': url,
}))
if response.get('status') != 'authed':
raise ExtractorError('Authentication failed!', expected=True)
webpage = self._download_webpage(url, video_id)
elif self._get_cookies('https://dropbox.com').get('sm_auth'):
webpage = self._download_webpage(url, video_id)
else:
raise ExtractorError('Invalid password', expected=True)
elif not self._get_cookies('https://dropbox.com').get('sm_auth'):
raise ExtractorError('Password protected video, use --video-password <password>', expected=True)
webpage = self._download_webpage(url, video_id)
formats, subtitles, has_anonymous_download = [], {}, False
for encoded in reversed(re.findall(r'registerStreamedPrefetch\s*\(\s*"[\w/+=]+"\s*,\s*"([\w/+=]+)"', webpage)):
decoded = base64.b64decode(encoded).decode('utf-8', 'ignore')
formats, subtitles = [], {}
has_anonymous_download = False
thumbnail = None
for part in self._yield_decoded_parts(webpage):
if not has_anonymous_download:
has_anonymous_download = self._search_regex(
r'(anonymous:\tanonymous)', decoded, 'anonymous', default=False)
r'(anonymous:\tanonymous)', part, 'anonymous', default=False)
transcode_url = self._search_regex(
r'\n.(https://[^\x03\x08\x12\n]+\.m3u8)', decoded, 'transcode url', default=None)
r'\n.(https://[^\x03\x08\x12\n]+\.m3u8)', part, 'transcode url', default=None)
if not transcode_url:
continue
formats, subtitles = self._extract_m3u8_formats_and_subtitles(transcode_url, video_id, 'mp4')
thumbnail = self._search_regex(
r'(https://www\.dropbox\.com/temp_thumb_from_token/[\w/?&=]+)', part, 'thumbnail', default=None)
break
# downloads enabled we can get the original file
@@ -89,4 +106,5 @@ class DropboxIE(InfoExtractor):
'title': title,
'formats': formats,
'subtitles': subtitles,
'thumbnail': thumbnail,
}

View File

@@ -2,6 +2,7 @@ from .common import InfoExtractor
from ..utils import (
float_or_none,
int_or_none,
join_nonempty,
orderedSet,
parse_iso8601,
parse_qs,
@@ -13,7 +14,7 @@ from ..utils import (
class EpidemicSoundIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?epidemicsound\.com/track/(?P<id>[0-9a-zA-Z]+)'
_VALID_URL = r'https?://(?:www\.)?epidemicsound\.com/(?:(?P<sfx>sound-effects/tracks)|track)/(?P<id>[0-9a-zA-Z-]+)'
_TESTS = [{
'url': 'https://www.epidemicsound.com/track/yFfQVRpSPz/',
'md5': 'd98ff2ddb49e8acab9716541cbc9dfac',
@@ -47,6 +48,20 @@ class EpidemicSoundIE(InfoExtractor):
'release_timestamp': 1700535606,
'release_date': '20231121',
},
}, {
'url': 'https://www.epidemicsound.com/sound-effects/tracks/2f02f54b-9faa-4daf-abac-1cfe9e9cef69/',
'md5': '35d7cf05bd8b614a84f0495a05de9388',
'info_dict': {
'id': '208931',
'ext': 'mp3',
'upload_date': '20240603',
'timestamp': 1717436529,
'categories': ['appliance'],
'display_id': '6b2NXLURPr',
'duration': 1.0,
'title': 'Oven, Grill, Door Open 01',
'thumbnail': 'https://cdn.epidemicsound.com/curation-assets/commercial-release-cover-images/default-sfx/3000x3000.jpg',
},
}]
@staticmethod
@@ -77,8 +92,10 @@ class EpidemicSoundIE(InfoExtractor):
return f
def _real_extract(self, url):
video_id = self._match_id(url)
json_data = self._download_json(f'https://www.epidemicsound.com/json/track/{video_id}', video_id)
video_id, is_sfx = self._match_valid_url(url).group('id', 'sfx')
json_data = self._download_json(join_nonempty(
'https://www.epidemicsound.com/json/track',
is_sfx and 'kosmos-id', video_id, delim='/'), video_id)
thumbnails = traverse_obj(json_data, [('imageUrl', 'cover')])
thumb_base_url = traverse_obj(json_data, ('coverArt', 'baseUrl', {url_or_none}))

View File

@@ -17,6 +17,7 @@ from ..utils import (
url_or_none,
variadic,
)
from ..utils.traversal import traverse_obj
class ERTFlixBaseIE(InfoExtractor):
@@ -74,29 +75,28 @@ class ERTFlixCodenameIE(ERTFlixBaseIE):
def _extract_formats_and_subs(self, video_id):
media_info = self._call_api(video_id, codename=video_id)
formats, subs = [], {}
for media_file in try_get(media_info, lambda x: x['MediaFiles'], list) or []:
for media in try_get(media_file, lambda x: x['Formats'], list) or []:
fmt_url = url_or_none(try_get(media, lambda x: x['Url']))
if not fmt_url:
continue
ext = determine_ext(fmt_url)
if ext == 'm3u8':
formats_, subs_ = self._extract_m3u8_formats_and_subtitles(
fmt_url, video_id, m3u8_id='hls', ext='mp4', fatal=False)
elif ext == 'mpd':
formats_, subs_ = self._extract_mpd_formats_and_subtitles(
fmt_url, video_id, mpd_id='dash', fatal=False)
else:
formats.append({
'url': fmt_url,
'format_id': str_or_none(media.get('Id')),
})
continue
formats.extend(formats_)
self._merge_subtitles(subs_, target=subs)
formats, subtitles = [], {}
for media in traverse_obj(media_info, (
'MediaFiles', lambda _, v: v['RoleCodename'] == 'main',
'Formats', lambda _, v: url_or_none(v['Url']))):
fmt_url = media['Url']
ext = determine_ext(fmt_url)
if ext == 'm3u8':
fmts, subs = self._extract_m3u8_formats_and_subtitles(
fmt_url, video_id, m3u8_id='hls', ext='mp4', fatal=False)
elif ext == 'mpd':
fmts, subs = self._extract_mpd_formats_and_subtitles(
fmt_url, video_id, mpd_id='dash', fatal=False)
else:
formats.append({
'url': fmt_url,
'format_id': str_or_none(media.get('Id')),
})
continue
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
return formats, subs
return formats, subtitles
def _real_extract(self, url):
video_id = self._match_id(url)

View File

@@ -294,37 +294,37 @@ class ESPNCricInfoIE(InfoExtractor):
class WatchESPNIE(AdobePassIE):
_VALID_URL = r'https?://(?:www\.)?espn\.com/(?:watch|espnplus)/player/_/id/(?P<id>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})'
_TESTS = [{
'url': 'https://www.espn.com/watch/player/_/id/dbbc6b1d-c084-4b47-9878-5f13c56ce309',
'url': 'https://www.espn.com/watch/player/_/id/11ce417a-6ac9-42b6-8a15-46aeb9ad5710',
'info_dict': {
'id': 'dbbc6b1d-c084-4b47-9878-5f13c56ce309',
'id': '11ce417a-6ac9-42b6-8a15-46aeb9ad5710',
'ext': 'mp4',
'title': 'Huddersfield vs. Burnley',
'duration': 7500,
'thumbnail': 'https://artwork.api.espn.com/artwork/collections/media/dbbc6b1d-c084-4b47-9878-5f13c56ce309/default?width=640&apikey=1ngjw23osgcis1i1vbj96lmfqs',
'title': 'Abilene Chrstn vs. Texas Tech',
'duration': 14166,
'thumbnail': 'https://s.secure.espncdn.com/stitcher/artwork/collections/media/11ce417a-6ac9-42b6-8a15-46aeb9ad5710/16x9.jpg?timestamp=202407252343&showBadge=true&cb=12&package=ESPN_PLUS',
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://www.espn.com/watch/player/_/id/a049a56e-a7ce-477e-aef3-c7e48ef8221c',
'url': 'https://www.espn.com/watch/player/_/id/90a2c85d-75e0-4b1e-a878-8e428a3cb2f3',
'info_dict': {
'id': 'a049a56e-a7ce-477e-aef3-c7e48ef8221c',
'id': '90a2c85d-75e0-4b1e-a878-8e428a3cb2f3',
'ext': 'mp4',
'title': 'Dynamo Dresden vs. VfB Stuttgart (Round #1) (German Cup)',
'duration': 8335,
'thumbnail': 'https://s.secure.espncdn.com/stitcher/artwork/collections/media/bd1f3d12-0654-47d9-852e-71b85ea695c7/16x9.jpg?timestamp=202201112217&showBadge=true&cb=12&package=ESPN_PLUS',
'title': 'UC Davis vs. California',
'duration': 9547,
'thumbnail': 'https://artwork.api.espn.com/artwork/collections/media/90a2c85d-75e0-4b1e-a878-8e428a3cb2f3/default?width=640&apikey=1ngjw23osgcis1i1vbj96lmfqs',
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://www.espn.com/espnplus/player/_/id/317f5fd1-c78a-4ebe-824a-129e0d348421',
'url': 'https://www.espn.com/watch/player/_/id/c4313bbe-95b5-4bb8-b251-ac143ea0fc54',
'info_dict': {
'id': '317f5fd1-c78a-4ebe-824a-129e0d348421',
'id': 'c4313bbe-95b5-4bb8-b251-ac143ea0fc54',
'ext': 'mp4',
'title': 'The Wheel - Episode 10',
'duration': 3352,
'thumbnail': 'https://s.secure.espncdn.com/stitcher/artwork/collections/media/317f5fd1-c78a-4ebe-824a-129e0d348421/16x9.jpg?timestamp=202205031523&showBadge=true&cb=12&package=ESPN_PLUS',
'title': 'The College Football Show',
'duration': 3639,
'thumbnail': 'https://artwork.api.espn.com/artwork/collections/media/c4313bbe-95b5-4bb8-b251-ac143ea0fc54/default?width=640&apikey=1ngjw23osgcis1i1vbj96lmfqs',
},
'params': {
'skip_download': True,
@@ -353,6 +353,13 @@ class WatchESPNIE(AdobePassIE):
if not cookie:
self.raise_login_required(method='cookies')
jwt = self._search_regex(r'=([^|]+)\|', cookie.value, 'cookie jwt')
id_token = self._download_json(
'https://registerdisney.go.com/jgc/v6/client/ESPN-ONESITE.WEB-PROD/guest/refresh-auth',
None, 'Refreshing token', headers={'Content-Type': 'application/json'}, data=json.dumps({
'refreshToken': json.loads(base64.urlsafe_b64decode(f'{jwt}==='))['refresh_token'],
}).encode())['data']['token']['id_token']
assertion = self._call_bamgrid_api(
'devices', video_id,
headers={'Content-Type': 'application/json; charset=UTF-8'},
@@ -371,7 +378,7 @@ class WatchESPNIE(AdobePassIE):
})['access_token']
assertion = self._call_bamgrid_api(
'accounts/grant', video_id, payload={'id_token': cookie.value.split('|')[1]},
'accounts/grant', video_id, payload={'id_token': id_token},
headers={
'Authorization': token,
'Content-Type': 'application/json; charset=UTF-8',

View File

@@ -3,7 +3,12 @@ from ..utils import traverse_obj
class EurosportIE(InfoExtractor):
_VALID_URL = r'https?://www\.eurosport\.com/\w+/(?:[\w-]+/[\d-]+/)?[\w-]+_(?P<id>vid\d+)'
_VALID_URL = r'''(?x)
https?://(?:
(?:(?:www|espanol)\.)?eurosport\.(?:com(?:\.tr)?|de|dk|es|fr|hu|it|nl|no|ro)|
eurosport\.tvn24\.pl
)/[\w-]+/(?:[\w-]+/[\d-]+/)?[\w.-]+_(?P<id>vid\d+)
'''
_TESTS = [{
'url': 'https://www.eurosport.com/tennis/roland-garros/2022/highlights-rafael-nadal-brushes-aside-caper-ruud-to-win-record-extending-14th-french-open-title_vid1694147/video.shtml',
'info_dict': {
@@ -70,6 +75,42 @@ class EurosportIE(InfoExtractor):
'duration': 105.0,
'upload_date': '20230518',
},
}, {
'url': 'https://www.eurosport.de/radsport/vuelta-a-espana/2024/vuelta-a-espana-2024-wout-van-aert-und-co.-verzweifeln-an-mcnulty-zeitfahr-krimi-in-lissabon_vid2219478/video.shtml',
'only_matching': True,
}, {
'url': 'https://www.eurosport.dk/speedway/mikkel-michelsen-misser-finalen-i-cardiff-se-danskeren-i-semifinalen-her_vid2219363/video.shtml',
'only_matching': True,
}, {
'url': 'https://www.eurosport.nl/mixed-martial-arts/ufc/2022/ufc-305-respect-tussen-adesanya-en-du-plessis_vid2219650/video.shtml',
'only_matching': True,
}, {
'url': 'https://www.eurosport.es/ciclismo/la-vuelta-2024-carlos-rodriguez-olvida-la-crono-y-ya-espera-que-llegue-la-montana-no-me-encontre-nada-comodo_vid2219682/video.shtml',
'only_matching': True,
}, {
'url': 'https://www.eurosport.fr/football/supercoupe-d-europe/2024-2025/kylian-mbappe-vinicius-junior-eduardo-camavinga-touche.-extraits-de-l-entrainement-du-real-madrid-en-video_vid2216993/video.shtml',
'only_matching': True,
}, {
'url': 'https://www.eurosport.it/calcio/serie-a/2024-2025/samardzic-a-bergamo-per-le-visite-mediche-con-l-atalanta_vid2219680/video.shtml',
'only_matching': True,
}, {
'url': 'https://www.eurosport.hu/kerekpar/vuelta-a-espana/2024/dramai-harc-a-masodpercekert-meglepetesgyoztes-a-vuelta-nyitoszakaszan_vid2219481/video.shtml',
'only_matching': True,
}, {
'url': 'https://www.eurosport.no/golf/fedex-st-jude-championship/2024/ligger-pa-andreplass-sa-skjer-dette-drama_vid30000618/video.shtml',
'only_matching': True,
}, {
'url': 'https://www.eurosport.no/golf/fedex-st-jude-championship/2024/ligger-pa-andreplass-sa-skjer-dette-drama_vid2219531/video.shtml',
'only_matching': True,
}, {
'url': 'https://www.eurosport.ro/tenis/western-southern-open-2/2024/rezumatul-partidei-dintre-zverev-si-shelton-de-la-cincinnati_vid2219657/video.shtml',
'only_matching': True,
}, {
'url': 'https://www.eurosport.com.tr/hentbol/olympic-games-paris-2024/2024/paris-2024-denmark-ile-germany-olimpiyatlarin-onemli-anlari_vid2215836/video.shtml',
'only_matching': True,
}, {
'url': 'https://eurosport.tvn24.pl/kolarstwo/tour-de-france-kobiet/2024/kasia-niewiadoma-przed-ostatnim-8.-etapem-tour-de-france-kobiet_vid2219765/video.shtml',
'only_matching': True,
}]
_TOKEN = None
@@ -77,6 +118,7 @@ class EurosportIE(InfoExtractor):
# actually defined in https://netsport.eurosport.io/?variables={"databaseId":<databaseId>,"playoutType":"VDP"}&extensions={"persistedQuery":{"version":1 ..
# but this method require to get sha256 hash
_GEO_COUNTRIES = ['DE', 'NL', 'EU', 'IT', 'FR'] # Not complete list but it should work
_GEO_BYPASS = False
def _real_initialize(self):
if EurosportIE._TOKEN is None:
@@ -98,13 +140,13 @@ class EurosportIE(InfoExtractor):
for stream_type in json_data['attributes']['streaming']:
if stream_type == 'hls':
fmts, subs = self._extract_m3u8_formats_and_subtitles(
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id, ext='mp4')
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id, ext='mp4', fatal=False)
elif stream_type == 'dash':
fmts, subs = self._extract_mpd_formats_and_subtitles(
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id)
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id, fatal=False)
elif stream_type == 'mss':
fmts, subs = self._extract_ism_formats_and_subtitles(
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id)
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id, fatal=False)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)

View File

@@ -84,7 +84,7 @@ class FacebookIE(InfoExtractor):
'timestamp': 1692346159,
'thumbnail': r're:^https?://.*',
'uploader_id': '100063551323670',
'duration': 3132.184,
'duration': 3133.583,
'view_count': int,
'concurrent_view_count': 0,
},
@@ -112,9 +112,10 @@ class FacebookIE(InfoExtractor):
'upload_date': '20140506',
'timestamp': 1399398998,
'thumbnail': r're:^https?://.*',
'uploader_id': 'pfbid028wxorhX2ErLFJ578N6P3crHD3PHmXTCqCvfBpsnbSLmbokwSY75p5hWBjHGkG4zxl',
'uploader_id': 'pfbid05AzrFTXgY37tqwaSgbFTTEpCLBjjEJHkigogwGiRPtKEpAsJYJpzE94H1RxYXWEtl',
'duration': 131.03,
'concurrent_view_count': int,
'view_count': int,
},
}, {
'note': 'Video with DASH manifest',
@@ -167,7 +168,7 @@ class FacebookIE(InfoExtractor):
# have 1080P, but only up to 720p in swf params
# data.video.story.attachments[].media
'url': 'https://www.facebook.com/cnn/videos/10155529876156509/',
'md5': 'ca63897a90c9452efee5f8c40d080e25',
'md5': '1659aa21fb3dd1585874f668e81a72c8',
'info_dict': {
'id': '10155529876156509',
'ext': 'mp4',
@@ -180,9 +181,10 @@ class FacebookIE(InfoExtractor):
'view_count': int,
'uploader_id': '100059479812265',
'concurrent_view_count': int,
'duration': 44.478,
'duration': 44.181,
},
}, {
# FIXME: unable to extract uploader, no formats found
# bigPipe.onPageletArrive ... onPageletArrive pagelet_group_mall
# data.node.comet_sections.content.story.attachments[].style_type_renderer.attachment.media
'url': 'https://www.facebook.com/yaroslav.korpan/videos/1417995061575415/',
@@ -241,9 +243,9 @@ class FacebookIE(InfoExtractor):
'timestamp': 1511548260,
'upload_date': '20171124',
'uploader': 'Vickie Gentry',
'uploader_id': 'pfbid0FuZhHCeWDAxWxEbr3yKPFaRstXvRxgsp9uCPG6GjD4J2AitB35NUAuJ4Q75KcjiDl',
'uploader_id': 'pfbid0FkkycT95ySNNyfCw4Cho6u5G7WbbZEcxT496Hq8rtx1K3LcTCATpR3wnyYhmyGC5l',
'thumbnail': r're:^https?://.*',
'duration': 148.435,
'duration': 148.224,
},
}, {
# data.node.comet_sections.content.story.attachments[].styles.attachment.media
@@ -271,7 +273,7 @@ class FacebookIE(InfoExtractor):
'description': 'Today Makkovik\'s own Pilot Mandy Smith made her inaugural landing on the airstrip in her hometown. What a proud moment as we all cheered and...',
'thumbnail': r're:^https?://.*',
'uploader': 'Lela Evans',
'uploader_id': 'pfbid0shZJipuigyy5mqrUJn9ub5LJFWNHvan5prtyi3LrDuuuJ4NwrURgnQHYR9fywBepl',
'uploader_id': 'pfbid0swT2y7t6TAsZVBvcyeYPdhTMefGaS26mzUwML3vd1ma6ndGZKxsyS4Ssu3jitZLXl',
'upload_date': '20231228',
'timestamp': 1703804085,
'duration': 394.347,
@@ -322,7 +324,7 @@ class FacebookIE(InfoExtractor):
'upload_date': '20180523',
'uploader': 'ESL One Dota 2',
'uploader_id': '100066514874195',
'duration': 4524.212,
'duration': 4524.001,
'view_count': int,
'thumbnail': r're:^https?://.*',
'concurrent_view_count': int,
@@ -339,9 +341,9 @@ class FacebookIE(InfoExtractor):
'title': 'Josef',
'thumbnail': r're:^https?://.*',
'concurrent_view_count': int,
'uploader_id': 'pfbid0cibUN6tV7DYgdbJdsUFN46wc4jKpVSPAvJQhFofGqBGmVn3V3JtAs2tfUwziw2hUl',
'uploader_id': 'pfbid02gpfwRM2XvdEJfsERupwQiNmBiDArc38RMRYZnap372q6Vs7MtFTVy72mmFWpJBTKl',
'timestamp': 1549275572,
'duration': 3.413,
'duration': 3.283,
'uploader': 'Josef Novak',
'description': '',
'upload_date': '20190204',
@@ -396,6 +398,7 @@ class FacebookIE(InfoExtractor):
'playlist_count': 1,
'skip': 'Requires logging in',
}, {
# FIXME: Cannot parse data error
# data.event.cover_media_renderer.cover_video
'url': 'https://m.facebook.com/events/1509582499515440',
'info_dict': {
@@ -498,7 +501,8 @@ class FacebookIE(InfoExtractor):
or get_first(post, ('video', 'creation_story', 'attachments', ..., 'media', lambda k, v: k == 'owner' and v['name']))
or get_first(post, (..., 'video', lambda k, v: k == 'owner' and v['name']))
or get_first(post, ('node', 'actors', ..., {dict}))
or get_first(post, ('event', 'event_creator', {dict})) or {})
or get_first(post, ('event', 'event_creator', {dict}))
or get_first(post, ('video', 'creation_story', 'short_form_video_context', 'video_owner', {dict})) or {})
uploader = uploader_data.get('name') or (
clean_html(get_element_by_id('fbPhotoPageAuthorName', webpage))
or self._search_regex(
@@ -524,6 +528,11 @@ class FacebookIE(InfoExtractor):
webpage, 'view count', default=None)),
'concurrent_view_count': get_first(post, (
('video', (..., ..., 'attachments', ..., 'media')), 'liveViewerCount', {int_or_none})),
**traverse_obj(post, (lambda _, v: video_id in v['url'], 'feedback', {
'like_count': ('likers', 'count', {int}),
'comment_count': ('total_comment_count', {int}),
'repost_count': ('share_count_reduced', {parse_count}),
}), get_all=False),
}
info_json_ld = self._search_json_ld(webpage, video_id, default={})
@@ -571,16 +580,21 @@ class FacebookIE(InfoExtractor):
# Formats larger than ~500MB will return error 403 unless chunk size is regulated
f.setdefault('downloader_options', {})['http_chunk_size'] = 250 << 20
def extract_relay_data(_filter):
return self._parse_json(self._search_regex(
rf'data-sjs>({{.*?{_filter}.*?}})</script>',
webpage, 'replay data', default='{}'), video_id, fatal=False) or {}
def yield_all_relay_data(_filter):
for relay_data in re.findall(rf'data-sjs>({{.*?{_filter}.*?}})</script>', webpage):
yield self._parse_json(relay_data, video_id, fatal=False) or {}
def extract_relay_prefetched_data(_filter):
return traverse_obj(extract_relay_data(_filter), (
'require', (None, (..., ..., ..., '__bbox', 'require')),
def extract_relay_data(_filter):
return next(filter(None, yield_all_relay_data(_filter)), {})
def extract_relay_prefetched_data(_filter, target_keys=None):
path = 'data'
if target_keys is not None:
path = lambda k, v: k == 'data' and any(target in v for target in variadic(target_keys))
return traverse_obj(yield_all_relay_data(_filter), (
..., 'require', (None, (..., ..., ..., '__bbox', 'require')),
lambda _, v: any(key.startswith('RelayPrefetchedStreamCache') for key in v),
..., ..., '__bbox', 'result', 'data', {dict}), get_all=False) or {}
..., ..., '__bbox', 'result', path, {dict}), get_all=False) or {}
if not video_data:
server_js_data = self._parse_json(self._search_regex([
@@ -591,7 +605,8 @@ class FacebookIE(InfoExtractor):
if not video_data:
data = extract_relay_prefetched_data(
r'"(?:dash_manifest|playable_url(?:_quality_hd)?)')
r'"(?:dash_manifest|playable_url(?:_quality_hd)?)',
target_keys=('video', 'event', 'nodes', 'node', 'mediaset'))
if data:
entries = []
@@ -926,18 +941,21 @@ class FacebookReelIE(InfoExtractor):
_TESTS = [{
'url': 'https://www.facebook.com/reel/1195289147628387',
'md5': 'f13dd37f2633595982db5ed8765474d3',
'md5': 'a53256d10fc2105441fe0c4212ed8cea',
'info_dict': {
'id': '1195289147628387',
'ext': 'mp4',
'title': 'md5:b05800b5b1ad56c0ca78bd3807b6a61e',
'description': 'md5:22f03309b216ac84720183961441d8db',
'uploader': 'md5:723e6cb3091241160f20b3c5dc282af1',
'title': r're:9\.6K views · 355 reactions .+ Let the “Slapathon” commence!! .+ LL COOL J · Mama Said Knock You Out$',
'description': r're:When your trying to help your partner .+ LL COOL J · Mama Said Knock You Out$',
'uploader': 'Beast Camp Training',
'uploader_id': '100040874179269',
'duration': 9.579,
'timestamp': 1637502609,
'upload_date': '20211121',
'thumbnail': r're:^https?://.*',
'like_count': int,
'comment_count': int,
'repost_count': int,
},
}]
@@ -957,6 +975,7 @@ class FacebookAdsIE(InfoExtractor):
'id': '899206155126718',
'ext': 'mp4',
'title': 'video by Kandao',
'description': 'md5:0822724069e3aca97cbed5dabbab282e',
'uploader': 'Kandao',
'uploader_id': '774114102743284',
'uploader_url': r're:^https?://.*',
@@ -965,6 +984,22 @@ class FacebookAdsIE(InfoExtractor):
'upload_date': '20231214',
'like_count': int,
},
}, {
# key 'watermarked_video_sd_url' missing
'url': 'https://www.facebook.com/ads/library/?id=501152689226254',
'info_dict': {
'id': '501152689226254',
'ext': 'mp4',
'title': 'video by mat.nawrocki',
'description': 'md5:02a446ace7ff8c3c37a2892922492490',
'uploader': 'mat.nawrocki',
'uploader_id': '148586968341456',
'uploader_url': r're:^https?://.*',
'timestamp': 1723452305,
'thumbnail': r're:^https?://.*',
'upload_date': '20240812',
'like_count': int,
},
}, {
'url': 'https://www.facebook.com/ads/library/?id=893637265423481',
'info_dict': {
@@ -1011,34 +1046,42 @@ class FacebookAdsIE(InfoExtractor):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
post_data = [self._parse_json(j, video_id, fatal=False)
for j in re.findall(r's\.handle\(({.*})\);requireLazy\(', webpage)]
data = traverse_obj(post_data, (
..., 'require', ..., ..., ..., 'props', 'deeplinkAdCard', 'snapshot', {dict}), get_all=False)
post_data = traverse_obj(
re.findall(r'data-sjs>({.*?ScheduledServerJS.*?})</script>', webpage), (..., {json.loads}))
data = get_first(post_data, (
'require', ..., ..., ..., '__bbox', 'require', ..., ..., ...,
'entryPointRoot', 'otherProps', 'deeplinkAdCard', 'snapshot', {dict}))
if not data:
raise ExtractorError('Unable to extract ad data')
title = data.get('title')
if not title or title == '{{product.name}}':
title = join_nonempty('display_format', 'page_name', delim=' by ', from_dict=data)
markup_id = traverse_obj(data, ('body', '__m', {str}))
markup = traverse_obj(post_data, (
..., 'require', ..., ..., ..., '__bbox', 'markup', lambda _, v: v[0].startswith(markup_id),
..., '__html', {clean_html}, {lambda x: not x.startswith('{{product.') and x}, any))
info_dict = traverse_obj(data, {
'description': ('link_description', {str}, {lambda x: x if x != '{{product.description}}' else None}),
info_dict = merge_dicts({
'title': title,
'description': markup or None,
}, traverse_obj(data, {
'description': ('link_description', {lambda x: x if not x.startswith('{{product.') else None}),
'uploader': ('page_name', {str}),
'uploader_id': ('page_id', {str_or_none}),
'uploader_url': ('page_profile_uri', {url_or_none}),
'timestamp': ('creation_time', {int_or_none}),
'like_count': ('page_like_count', {int_or_none}),
})
}))
entries = []
for idx, entry in enumerate(traverse_obj(
data, (('videos', 'cards'), lambda _, v: any(url_or_none(v[f]) for f in self._FORMATS_MAP))), 1,
data, (('videos', 'cards'), lambda _, v: any(url_or_none(v.get(f)) for f in self._FORMATS_MAP))), 1,
):
entries.append({
'id': f'{video_id}_{idx}',
'title': entry.get('title') or title,
'description': entry.get('link_description') or info_dict.get('description'),
'description': traverse_obj(entry, 'body', 'link_description') or info_dict.get('description'),
'thumbnail': url_or_none(entry.get('video_preview_image_url')),
'formats': self._extract_formats(entry),
})

View File

@@ -14,7 +14,7 @@ from ..utils import (
class FC2IE(InfoExtractor):
_VALID_URL = r'^(?:https?://video\.fc2\.com/(?:[^/]+/)*content/|fc2:)(?P<id>[^/]+)'
_VALID_URL = r'(?:https?://video\.fc2\.com/(?:[^/]+/)*content/|fc2:)(?P<id>[^/]+)'
IE_NAME = 'fc2'
_NETRC_MACHINE = 'fc2'
_TESTS = [{

View File

@@ -43,6 +43,7 @@ from ..utils import (
xpath_text,
xpath_with_ns,
)
from ..utils._utils import _UnsafeExtensionError
class GenericIE(InfoExtractor):
@@ -2339,7 +2340,7 @@ class GenericIE(InfoExtractor):
default_search = 'fixup_error'
if default_search in ('auto', 'auto_warning', 'fixup_error'):
if re.match(r'^[^\s/]+\.[^\s/]+/', url):
if re.match(r'[^\s/]+\.[^\s/]+/', url):
self.report_warning('The url doesn\'t specify the protocol, trying with http')
return self.url_result('http://' + url)
elif default_search != 'fixup_error':
@@ -2399,7 +2400,7 @@ class GenericIE(InfoExtractor):
# Check for direct link to a video
content_type = full_response.headers.get('Content-Type', '').lower()
m = re.match(r'^(?P<type>audio|video|application(?=/(?:ogg$|(?:vnd\.apple\.|x-)?mpegurl)))/(?P<format_id>[^;\s]+)', content_type)
m = re.match(r'(?P<type>audio|video|application(?=/(?:ogg$|(?:vnd\.apple\.|x-)?mpegurl)))/(?P<format_id>[^;\s]+)', content_type)
if m:
self.report_detected('direct video link')
headers = filter_dict({'Referer': smuggled_data.get('referer')})
@@ -2446,9 +2447,13 @@ class GenericIE(InfoExtractor):
if not is_html(first_bytes):
self.report_warning(
'URL could be a direct video link, returning it as such.')
ext = determine_ext(url)
if ext not in _UnsafeExtensionError.ALLOWED_EXTENSIONS:
ext = 'unknown_video'
info_dict.update({
'direct': True,
'url': url,
'ext': ext,
})
return info_dict

View File

@@ -0,0 +1,91 @@
from .common import InfoExtractor
from .vimeo import VimeoIE
from ..utils import (
parse_qs,
traverse_obj,
url_or_none,
)
class GermanupaIE(InfoExtractor):
IE_DESC = 'germanupa.de'
_VALID_URL = r'https?://germanupa\.de/mediathek/(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://germanupa.de/mediathek/4-figma-beratung-deine-sprechstunde-fuer-figma-fragen',
'info_dict': {
'id': '909179246',
'title': 'Tutorial: #4 Figma Beratung - Deine Sprechstunde für Figma-Fragen',
'ext': 'mp4',
'uploader': 'German UPA',
'uploader_id': 'germanupa',
'thumbnail': 'https://i.vimeocdn.com/video/1792564420-7415283ccef8bf8702dab8c6b7515555ceeb7a1c11371ffcc133b8e887dbf70e-d_1280',
'uploader_url': 'https://vimeo.com/germanupa',
'duration': 3987,
},
'expected_warnings': ['Failed to parse XML: not well-formed'],
'params': {'skip_download': 'm3u8'},
}, {
'note': 'audio, uses GenericIE',
'url': 'https://germanupa.de/mediathek/live-vom-ux-festival-neuigkeiten-von-figma-jobmarkt-agenturszene-interview-zu-sustainable',
'info_dict': {
'id': '1867346676',
'title': 'Live vom UX Festival: Neuigkeiten von Figma, Jobmarkt, Agenturszene & Interview zu Sustainable UX',
'ext': 'opus',
'timestamp': 1720545088,
'upload_date': '20240709',
'duration': 3910.557,
'like_count': int,
'description': 'md5:db2aed5ff131e177a7b33901e9a8db05',
'uploader': 'German UPA',
'repost_count': int,
'genres': ['Science'],
'license': 'all-rights-reserved',
'uploader_url': 'https://soundcloud.com/user-80097677',
'uploader_id': '471579486',
'view_count': int,
'comment_count': int,
'thumbnail': 'https://i1.sndcdn.com/artworks-oCti2e9GhaZFWBqY-48ybGw-original.jpg',
},
}, {
'note': 'Nur für Mitglieder/Just for members',
'url': 'https://germanupa.de/mediathek/ux-festival-2024-usability-tests-und-ai',
'info_dict': {
'id': '986994430',
'title': 'UX Festival 2024 "Usability Tests und AI" von Lennart Weber',
'ext': 'mp4',
'release_date': '20240719',
'uploader_url': 'https://vimeo.com/germanupa',
'timestamp': 1721373980,
'license': 'by-sa',
'like_count': int,
'thumbnail': 'https://i.vimeocdn.com/video/1904187064-2a672630c30f9ad787bd390bff3f51d7506a3e8416763ba6dbf465732b165c5c-d_1280',
'duration': 2146,
'release_timestamp': 1721373980,
'uploader': 'German UPA',
'uploader_id': 'germanupa',
'upload_date': '20240719',
'comment_count': int,
},
'expected_warnings': ['Failed to parse XML: not well-formed'],
'skip': 'login required',
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
param_url = traverse_obj(
self._search_regex(
r'<iframe[^>]+data-src\s*?=\s*?([\'"])(?P<url>https://germanupa\.de/media/oembed\?url=(?:(?!\1).)+)\1',
webpage, 'embedded video', default=None, group='url'),
({parse_qs}, 'url', 0, {url_or_none}))
if not param_url:
if self._search_regex(
r'<div[^>]+class\s*?=\s*?([\'"])(?:(?!\1).)*login-wrapper(?:(?!\1).)*\1',
webpage, 'login wrapper', default=None):
self.raise_login_required('This video is only available for members')
return self.url_result(url, 'Generic') # Fall back to generic to extract audio
real_url = param_url.replace('https://vimeo.com/', 'https://player.vimeo.com/video/')
return self.url_result(VimeoIE._smuggle_referrer(real_url, url), VimeoIE, video_id)

View File

@@ -52,7 +52,7 @@ class GetCourseRuIE(InfoExtractor):
_BASE_URL_RE = rf'https?://(?:(?!player02\.)[^.]+\.getcourse\.(?:ru|io)|{"|".join(map(re.escape, _DOMAINS))})'
_VALID_URL = [
rf'{_BASE_URL_RE}/(?!pl/|teach/)(?P<id>[^?#]+)',
rf'{_BASE_URL_RE}/(:?pl/)?teach/control/lesson/view\?(?:[^#]+&)?id=(?P<id>\d+)',
rf'{_BASE_URL_RE}/(?:pl/)?teach/control/lesson/view\?(?:[^#]+&)?id=(?P<id>\d+)',
]
_TESTS = [{
'url': 'http://academymel.online/3video_1',

View File

@@ -7,7 +7,7 @@ from ..utils import (
class GolemIE(InfoExtractor):
_VALID_URL = r'^https?://video\.golem\.de/.+?/(?P<id>.+?)/'
_VALID_URL = r'https?://video\.golem\.de/.+?/(?P<id>.+?)/'
_TEST = {
'url': 'http://video.golem.de/handy/14095/iphone-6-und-6-plus-test.html',
'md5': 'c1a2c0a3c863319651c7c992c5ee29bf',

View File

@@ -13,7 +13,7 @@ from ..utils import (
class HRFernsehenIE(InfoExtractor):
IE_NAME = 'hrfernsehen'
_VALID_URL = r'^https?://www\.(?:hr-fernsehen|hessenschau)\.de/.*,video-(?P<id>[0-9]{6})\.html'
_VALID_URL = r'https?://www\.(?:hr-fernsehen|hessenschau)\.de/.*,video-(?P<id>[0-9]{6})\.html'
_TESTS = [{
'url': 'https://www.hessenschau.de/tv-sendung/hessenschau-vom-26082020,video-130546.html',
'md5': '5c4e0ba94677c516a2f65a84110fc536',

View File

@@ -8,15 +8,19 @@ from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
parse_duration,
str_or_none,
try_get,
unescapeHTML,
unified_strdate,
update_url_query,
url_or_none,
)
from ..utils.traversal import traverse_obj
class HuyaLiveIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.|m\.)?huya\.com/(?P<id>[^/#?&]+)(?:\D|$)'
_VALID_URL = r'https?://(?:www\.|m\.)?huya\.com/(?!(?:video/play/))(?P<id>[^/#?&]+)(?:\D|$)'
IE_NAME = 'huya:live'
IE_DESC = 'huya.com'
TESTS = [{
@@ -24,6 +28,7 @@ class HuyaLiveIE(InfoExtractor):
'info_dict': {
'id': '572329',
'title': str,
'ext': 'flv',
'description': str,
'is_live': True,
'view_count': int,
@@ -131,3 +136,76 @@ class HuyaLiveIE(InfoExtractor):
fm = base64.b64decode(params['fm']).decode().split('_', 1)[0]
ss = hashlib.md5('|'.join([params['seqid'], params['ctype'], params['t']]))
return fm, ss
class HuyaVideoIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?huya\.com/video/play/(?P<id>\d+)\.html'
IE_NAME = 'huya:video'
IE_DESC = '虎牙视频'
_TESTS = [{
'url': 'https://www.huya.com/video/play/1002412640.html',
'info_dict': {
'id': '1002412640',
'ext': 'mp4',
'title': '8月3日',
'thumbnail': r're:https?://.*\.jpg',
'duration': 14,
'uploader': '虎牙-ATS欧卡车队青木',
'uploader_id': '1564376151',
'upload_date': '20240803',
'view_count': int,
'comment_count': int,
'like_count': int,
},
},
{
'url': 'https://www.huya.com/video/play/556054543.html',
'info_dict': {
'id': '556054543',
'ext': 'mp4',
'title': '我不挑事 也不怕事',
'thumbnail': r're:https?://.*\.jpg',
'duration': 1864,
'uploader': '卡尔',
'uploader_id': '367138632',
'upload_date': '20210811',
'view_count': int,
'comment_count': int,
'like_count': int,
},
}]
def _real_extract(self, url: str):
video_id = self._match_id(url)
video_data = self._download_json(
'https://liveapi.huya.com/moment/getMomentContent', video_id,
query={'videoId': video_id})['data']['moment']['videoInfo']
formats = []
for definition in traverse_obj(video_data, ('definitions', lambda _, v: url_or_none(v['url']))):
formats.append({
'url': definition['url'],
**traverse_obj(definition, {
'format_id': ('defName', {str}),
'width': ('width', {int_or_none}),
'height': ('height', {int_or_none}),
'filesize': ('size', {int_or_none}),
}),
})
return {
'id': video_id,
'formats': formats,
**traverse_obj(video_data, {
'title': ('videoTitle', {str}),
'thumbnail': ('videoCover', {url_or_none}),
'duration': ('videoDuration', {parse_duration}),
'uploader': ('nickName', {str}),
'uploader_id': ('uid', {str_or_none}),
'upload_date': ('videoUploadTime', {unified_strdate}),
'view_count': ('videoPlayNum', {int_or_none}),
'comment_count': ('videoCommentNum', {int_or_none}),
'like_count': ('favorCount', {int_or_none}),
}),
}

View File

@@ -25,9 +25,29 @@ class IPrimaIE(InfoExtractor):
'id': 'p51388',
'ext': 'mp4',
'title': 'Partička (92)',
'description': 'md5:859d53beae4609e6dd7796413f1b6cac',
'upload_date': '20201103',
'timestamp': 1604437480,
'description': 'md5:57943f6a50d6188288c3a579d2fd5f01',
'episode': 'Partička (92)',
'season': 'Partička',
'series': 'Prima Partička',
'episode_number': 92,
'thumbnail': 'https://d31b9s05ygj54s.cloudfront.net/prima-plus/image/video-ef6cf9de-c980-4443-92e4-17fe8bccd45c-16x9.jpeg',
},
'params': {
'skip_download': True, # m3u8 download
},
}, {
'url': 'https://zoom.iprima.cz/porady/krasy-kanarskych-ostrovu/tenerife-v-risi-ohne',
'info_dict': {
'id': 'p1412199',
'ext': 'mp4',
'episode_number': 3,
'episode': 'Tenerife: V říši ohně',
'description': 'md5:4b4a05c574b5eaef130e68d4811c3f2c',
'duration': 3111.0,
'thumbnail': 'https://d31b9s05ygj54s.cloudfront.net/prima-plus/image/video-f66dd7fb-c1a0-47d1-b3bc-7db328d566c5-16x9-1711636518.jpg/t_16x9_medium_1366_768',
'title': 'Tenerife: V říši ohně',
'timestamp': 1711825800,
'upload_date': '20240330',
},
'params': {
'skip_download': True, # m3u8 download
@@ -131,6 +151,7 @@ class IPrimaIE(InfoExtractor):
video_id = self._search_regex((
r'productId\s*=\s*([\'"])(?P<id>p\d+)\1',
r'pproduct_id\s*=\s*([\'"])(?P<id>p\d+)\1',
r'let\s+videos\s*=\s*([\'"])(?P<id>p\d+)\1',
), webpage, 'real id', group='id', default=None)
if not video_id:
@@ -176,7 +197,7 @@ class IPrimaIE(InfoExtractor):
final_result = self._search_json_ld(webpage, video_id, default={})
final_result.update({
'id': video_id,
'title': title,
'title': final_result.get('title') or title,
'thumbnail': self._html_search_meta(
['thumbnail', 'og:image', 'twitter:image'],
webpage, 'thumbnail', default=None),

View File

@@ -194,11 +194,14 @@ class ShugiinItvVodIE(ShugiinItvBaseIE):
class SangiinInstructionIE(InfoExtractor):
_VALID_URL = r'^https?://www\.webtv\.sangiin\.go\.jp/webtv/index\.php'
_VALID_URL = r'https?://www\.webtv\.sangiin\.go\.jp/webtv/index\.php'
IE_DESC = False # this shouldn't be listed as a supported site
def _real_extract(self, url):
raise ExtractorError('Copy the link from the botton below the video description or player, and use the link to download. If there are no button in the frame, get the URL of the frame showing the video.', expected=True)
raise ExtractorError(
'Copy the link from the button below the video description/player '
'and use that link to download. If there is no button in the frame, '
'get the URL of the frame showing the video.', expected=True)
class SangiinIE(InfoExtractor):

View File

@@ -158,7 +158,7 @@ class JioSaavnAlbumIE(JioSaavnBaseIE):
class JioSaavnPlaylistIE(JioSaavnBaseIE):
IE_NAME = 'jiosaavn:playlist'
_VALID_URL = r'https?://(?:www\.)?(?:jio)?saavn\.com/s/playlist/(?:[^/?#]+/){2}(?P<id>[^/?#]+)'
_VALID_URL = r'https?://(?:www\.)?(?:jio)?saavn\.com/(?:s/playlist/(?:[^/?#]+/){2}|featured/[^/?#]+/)(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.jiosaavn.com/s/playlist/2279fbe391defa793ad7076929a2f5c9/mood-english/LlJ8ZWT1ibN5084vKHRj2Q__',
'info_dict': {
@@ -173,6 +173,13 @@ class JioSaavnPlaylistIE(JioSaavnBaseIE):
'title': 'Mood Hindi',
},
'playlist_mincount': 801,
}, {
'url': 'https://www.jiosaavn.com/featured/taaza-tunes/Me5RridRfDk_',
'info_dict': {
'id': 'Me5RridRfDk_',
'title': 'Taaza Tunes',
},
'playlist_mincount': 301,
}]
_PAGE_SIZE = 50

View File

@@ -22,7 +22,7 @@ class KalturaIE(InfoExtractor):
(?:
kaltura:(?P<partner_id>\w+):(?P<id>\w+)(?::(?P<player_type>\w+))?|
https?://
(:?(?:www|cdnapi(?:sec)?)\.)?kaltura\.com(?::\d+)?/
(?:(?:www|cdnapi(?:sec)?)\.)?kaltura\.com(?::\d+)?/
(?:
(?:
# flash player

View File

@@ -15,7 +15,7 @@ from ..utils import (
class KhanAcademyBaseIE(InfoExtractor):
_VALID_URL_TEMPL = r'https?://(?:www\.)?khanacademy\.org/(?P<id>(?:[^/]+/){%s}%s[^?#/&]+)'
_PUBLISHED_CONTENT_VERSION = '171419ab20465d931b356f22d20527f13969bb70'
_PUBLISHED_CONTENT_VERSION = 'dc34750f0572c80f5effe7134082fe351143c1e4'
def _parse_video(self, video):
return {
@@ -39,7 +39,7 @@ class KhanAcademyBaseIE(InfoExtractor):
query={
'fastly_cacheable': 'persist_until_publish',
'pcv': self._PUBLISHED_CONTENT_VERSION,
'hash': '1242644265',
'hash': '3712657851',
'variables': json.dumps({
'path': display_id,
'countryCode': 'US',

View File

@@ -1,9 +1,14 @@
import functools
from .common import InfoExtractor
from ..networking import HEADRequest
from ..utils import (
UserNotLive,
determine_ext,
float_or_none,
int_or_none,
merge_dicts,
parse_iso8601,
str_or_none,
traverse_obj,
unified_timestamp,
@@ -25,104 +30,212 @@ class KickBaseIE(InfoExtractor):
def _call_api(self, path, display_id, note='Downloading API JSON', headers={}, **kwargs):
return self._download_json(
f'https://kick.com/api/v1/{path}', display_id, note=note,
f'https://kick.com/api/{path}', display_id, note=note,
headers=merge_dicts(headers, self._API_HEADERS), impersonate=True, **kwargs)
class KickIE(KickBaseIE):
IE_NAME = 'kick:live'
_VALID_URL = r'https?://(?:www\.)?kick\.com/(?!(?:video|categories|search|auth)(?:[/?#]|$))(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://kick.com/yuppy',
'url': 'https://kick.com/buddha',
'info_dict': {
'id': '6cde1-kickrp-joe-flemmingskick-info-heremust-knowmust-see21',
'id': '92722911-nopixel-40',
'ext': 'mp4',
'title': str,
'description': str,
'channel': 'yuppy',
'channel_id': '33538',
'uploader': 'Yuppy',
'uploader_id': '33793',
'upload_date': str,
'live_status': 'is_live',
'timestamp': int,
'thumbnail': r're:^https?://.*\.jpg',
'thumbnail': r're:https?://.+\.jpg',
'categories': list,
'upload_date': str,
'channel': 'buddha',
'channel_id': '32807',
'uploader': 'Buddha',
'uploader_id': '33057',
'live_status': 'is_live',
'concurrent_view_count': int,
'release_timestamp': int,
'age_limit': 18,
'release_date': str,
},
'skip': 'livestream',
'params': {'skip_download': 'livestream'},
# 'skip': 'livestream',
}, {
'url': 'https://kick.com/kmack710',
'url': 'https://kick.com/xqc',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if (KickVODIE.suitable(url) or KickClipIE.suitable(url)) else super().suitable(url)
def _real_extract(self, url):
channel = self._match_id(url)
response = self._call_api(f'channels/{channel}', channel)
response = self._call_api(f'v2/channels/{channel}', channel)
if not traverse_obj(response, 'livestream', expected_type=dict):
raise UserNotLive(video_id=channel)
return {
'id': str(traverse_obj(
response, ('livestream', ('slug', 'id')), get_all=False, default=channel)),
'formats': self._extract_m3u8_formats(
response['playback_url'], channel, 'mp4', live=True),
'title': traverse_obj(
response, ('livestream', ('session_title', 'slug')), get_all=False, default=''),
'description': traverse_obj(response, ('user', 'bio')),
'channel': channel,
'channel_id': str_or_none(traverse_obj(response, 'id', ('livestream', 'channel_id'))),
'uploader': traverse_obj(response, 'name', ('user', 'username')),
'uploader_id': str_or_none(traverse_obj(response, 'user_id', ('user', 'id'))),
'is_live': True,
'timestamp': unified_timestamp(traverse_obj(response, ('livestream', 'created_at'))),
'thumbnail': traverse_obj(
response, ('livestream', 'thumbnail', 'url'), expected_type=url_or_none),
'categories': traverse_obj(response, ('recent_categories', ..., 'name')),
'formats': self._extract_m3u8_formats(response['playback_url'], channel, 'mp4', live=True),
**traverse_obj(response, {
'id': ('livestream', 'slug', {str}),
'title': ('livestream', 'session_title', {str}),
'description': ('user', 'bio', {str}),
'channel_id': (('id', ('livestream', 'channel_id')), {int}, {str_or_none}, any),
'uploader': (('name', ('user', 'username')), {str}, any),
'uploader_id': (('user_id', ('user', 'id')), {int}, {str_or_none}, any),
'timestamp': ('livestream', 'created_at', {unified_timestamp}),
'release_timestamp': ('livestream', 'start_time', {unified_timestamp}),
'thumbnail': ('livestream', 'thumbnail', 'url', {url_or_none}),
'categories': ('recent_categories', ..., 'name', {str}),
'concurrent_view_count': ('livestream', 'viewer_count', {int_or_none}),
'age_limit': ('livestream', 'is_mature', {bool}, {lambda x: 18 if x else 0}),
}),
}
class KickVODIE(KickBaseIE):
_VALID_URL = r'https?://(?:www\.)?kick\.com/video/(?P<id>[\da-f]{8}-(?:[\da-f]{4}-){3}[\da-f]{12})'
IE_NAME = 'kick:vod'
_VALID_URL = r'https?://(?:www\.)?kick\.com/[\w-]+/videos/(?P<id>[\da-f]{8}-(?:[\da-f]{4}-){3}[\da-f]{12})'
_TESTS = [{
'url': 'https://kick.com/video/58bac65b-e641-4476-a7ba-3707a35e60e3',
'url': 'https://kick.com/xqc/videos/8dd97a8d-e17f-48fb-8bc3-565f88dbc9ea',
'md5': '3870f94153e40e7121a6e46c068b70cb',
'info_dict': {
'id': '58bac65b-e641-4476-a7ba-3707a35e60e3',
'id': '8dd97a8d-e17f-48fb-8bc3-565f88dbc9ea',
'ext': 'mp4',
'title': '🤠REBIRTH IS BACK!!!!🤠!stake CODE JAREDFPS 🤠',
'description': 'md5:02b0c46f9b4197fb545ab09dddb85b1d',
'channel': 'jaredfps',
'channel_id': '26608',
'uploader': 'JaredFPS',
'uploader_id': '26799',
'upload_date': '20240402',
'timestamp': 1712097108,
'duration': 33859.0,
'title': '18+ #ad 🛑LIVE🛑CLICK🛑DRAMA🛑NEWS🛑STUFF🛑REACT🛑GET IN HHERE🛑BOP BOP🛑WEEEE WOOOO🛑',
'description': 'THE BEST AT ABSOLUTELY EVERYTHING. THE JUICER. LEADER OF THE JUICERS.',
'channel': 'xqc',
'channel_id': '668',
'uploader': 'xQc',
'uploader_id': '676',
'upload_date': '20240909',
'timestamp': 1725919141,
'duration': 10155.0,
'thumbnail': r're:^https?://.*\.jpg',
'categories': ['Call of Duty: Warzone'],
'view_count': int,
'categories': ['Just Chatting'],
'age_limit': 0,
},
'params': {
'skip_download': 'm3u8',
},
'expected_warnings': [r'impersonation'],
'params': {'skip_download': 'm3u8'},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
response = self._call_api(f'video/{video_id}', video_id)
response = self._call_api(f'v1/video/{video_id}', video_id)
return {
'id': video_id,
'formats': self._extract_m3u8_formats(response['source'], video_id, 'mp4'),
'title': traverse_obj(
response, ('livestream', ('session_title', 'slug')), get_all=False, default=''),
'description': traverse_obj(response, ('livestream', 'channel', 'user', 'bio')),
'channel': traverse_obj(response, ('livestream', 'channel', 'slug')),
'channel_id': str_or_none(traverse_obj(response, ('livestream', 'channel', 'id'))),
'uploader': traverse_obj(response, ('livestream', 'channel', 'user', 'username')),
'uploader_id': str_or_none(traverse_obj(response, ('livestream', 'channel', 'user_id'))),
'timestamp': unified_timestamp(response.get('created_at')),
'duration': float_or_none(traverse_obj(response, ('livestream', 'duration')), scale=1000),
'thumbnail': traverse_obj(
response, ('livestream', 'thumbnail'), expected_type=url_or_none),
'categories': traverse_obj(response, ('livestream', 'categories', ..., 'name')),
**traverse_obj(response, {
'title': ('livestream', ('session_title', 'slug'), {str}, any),
'description': ('livestream', 'channel', 'user', 'bio', {str}),
'channel': ('livestream', 'channel', 'slug', {str}),
'channel_id': ('livestream', 'channel', 'id', {int}, {str_or_none}),
'uploader': ('livestream', 'channel', 'user', 'username', {str}),
'uploader_id': ('livestream', 'channel', 'user_id', {int}, {str_or_none}),
'timestamp': ('created_at', {parse_iso8601}),
'duration': ('livestream', 'duration', {functools.partial(float_or_none, scale=1000)}),
'thumbnail': ('livestream', 'thumbnail', {url_or_none}),
'categories': ('livestream', 'categories', ..., 'name', {str}),
'view_count': ('views', {int_or_none}),
'age_limit': ('livestream', 'is_mature', {bool}, {lambda x: 18 if x else 0}),
}),
}
class KickClipIE(KickBaseIE):
IE_NAME = 'kick:clips'
_VALID_URL = r'https?://(?:www\.)?kick\.com/[\w-]+(?:/clips/|/?\?(?:[^#]+&)?clip=)(?P<id>clip_[\w-]+)'
_TESTS = [{
'url': 'https://kick.com/mxddy?clip=clip_01GYXVB5Y8PWAPWCWMSBCFB05X',
'info_dict': {
'id': 'clip_01GYXVB5Y8PWAPWCWMSBCFB05X',
'ext': 'mp4',
'title': 'Maddy detains Abd D:',
'channel': 'mxddy',
'channel_id': '133789',
'uploader': 'AbdCreates',
'uploader_id': '3309077',
'thumbnail': r're:^https?://.*\.jpeg',
'duration': 35,
'timestamp': 1682481453,
'upload_date': '20230426',
'view_count': int,
'like_count': int,
'categories': ['VALORANT'],
'age_limit': 18,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://kick.com/destiny?clip=clip_01H9SKET879NE7N9RJRRDS98J3',
'info_dict': {
'id': 'clip_01H9SKET879NE7N9RJRRDS98J3',
'title': 'W jews',
'ext': 'mp4',
'channel': 'destiny',
'channel_id': '1772249',
'uploader': 'punished_furry',
'uploader_id': '2027722',
'duration': 49.0,
'upload_date': '20230908',
'timestamp': 1694150180,
'thumbnail': 'https://clips.kick.com/clips/j3/clip_01H9SKET879NE7N9RJRRDS98J3/thumbnail.png',
'view_count': int,
'like_count': int,
'categories': ['Just Chatting'],
'age_limit': 0,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://kick.com/spreen/clips/clip_01J8RGZRKHXHXXKJEHGRM932A5',
'info_dict': {
'id': 'clip_01J8RGZRKHXHXXKJEHGRM932A5',
'ext': 'mp4',
'title': 'KLJASLDJKLJKASDLJKDAS',
'channel': 'spreen',
'channel_id': '5312671',
'uploader': 'AnormalBarraBaja',
'uploader_id': '26518262',
'duration': 43.0,
'upload_date': '20240927',
'timestamp': 1727399987,
'thumbnail': 'https://clips.kick.com/clips/f2/clip_01J8RGZRKHXHXXKJEHGRM932A5/thumbnail.webp',
'view_count': int,
'like_count': int,
'categories': ['Minecraft'],
'age_limit': 0,
},
'params': {'skip_download': 'm3u8'},
}]
def _real_extract(self, url):
clip_id = self._match_id(url)
clip = self._call_api(f'v2/clips/{clip_id}/play', clip_id)['clip']
clip_url = clip['clip_url']
if determine_ext(clip_url) == 'm3u8':
formats = self._extract_m3u8_formats(clip_url, clip_id, 'mp4')
else:
formats = [{'url': clip_url}]
return {
'id': clip_id,
'formats': formats,
**traverse_obj(clip, {
'title': ('title', {str}),
'channel': ('channel', 'slug', {str}),
'channel_id': ('channel', 'id', {int}, {str_or_none}),
'uploader': ('creator', 'username', {str}),
'uploader_id': ('creator', 'id', {int}, {str_or_none}),
'thumbnail': ('thumbnail_url', {url_or_none}),
'duration': ('duration', {float_or_none}),
'categories': ('category', 'name', {str}, all),
'timestamp': ('created_at', {parse_iso8601}),
'view_count': ('views', {int_or_none}),
'like_count': ('likes', {int_or_none}),
'age_limit': ('is_mature', {bool}, {lambda x: 18 if x else 0}),
}),
}

126
yt_dlp/extractor/kika.py Normal file
View File

@@ -0,0 +1,126 @@
from .common import InfoExtractor
from ..utils import (
determine_ext,
int_or_none,
parse_duration,
parse_iso8601,
url_or_none,
)
from ..utils.traversal import traverse_obj
class KikaIE(InfoExtractor):
IE_DESC = 'KiKA.de'
_VALID_URL = r'https?://(?:www\.)?kika\.de/[\w/-]+/videos/(?P<id>[a-z-]+\d+)'
_GEO_COUNTRIES = ['DE']
_TESTS = [{
'url': 'https://www.kika.de/logo/videos/logo-vom-samstag-einunddreissig-august-zweitausendvierundzwanzig-100',
'md5': 'fbfc8da483719ef06f396e5e5b938c69',
'info_dict': {
'id': 'logo-vom-samstag-einunddreissig-august-zweitausendvierundzwanzig-100',
'ext': 'mp4',
'upload_date': '20240831',
'timestamp': 1725126600,
'season_number': 2024,
'modified_date': '20240831',
'episode': 'Episode 476',
'episode_number': 476,
'season': 'Season 2024',
'duration': 634,
'title': 'logo! vom Samstag, 31. August 2024',
'modified_timestamp': 1725129983,
},
}, {
'url': 'https://www.kika.de/kaltstart/videos/video92498',
'md5': '710ece827e5055094afeb474beacb7aa',
'info_dict': {
'id': 'video92498',
'ext': 'mp4',
'title': '7. Wo ist Leo?',
'description': 'md5:fb48396a5b75068bcac1df74f1524920',
'duration': 436,
'timestamp': 1702926876,
'upload_date': '20231218',
'episode_number': 7,
'modified_date': '20240319',
'modified_timestamp': 1710880610,
'episode': 'Episode 7',
'season_number': 1,
'season': 'Season 1',
},
}, {
'url': 'https://www.kika.de/bernd-das-brot/astrobrot/videos/video90088',
'md5': 'ffd1b700d7de0a6616a1d08544c77294',
'info_dict': {
'id': 'video90088',
'ext': 'mp4',
'upload_date': '20221102',
'timestamp': 1667390580,
'duration': 197,
'modified_timestamp': 1711093771,
'episode_number': 8,
'title': 'Es ist nicht leicht, ein Astrobrot zu sein',
'modified_date': '20240322',
'description': 'md5:d3641deaf1b5515a160788b2be4159a9',
'season_number': 1,
'episode': 'Episode 8',
'season': 'Season 1',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
doc = self._download_json(f'https://www.kika.de/_next-api/proxy/v1/videos/{video_id}', video_id)
video_assets = self._download_json(doc['assets']['url'], video_id)
subtitles = {}
if ttml_resource := url_or_none(video_assets.get('videoSubtitle')):
subtitles['de'] = [{
'url': ttml_resource,
'ext': 'ttml',
}]
if webvtt_resource := url_or_none(video_assets.get('webvttUrl')):
subtitles.setdefault('de', []).append({
'url': webvtt_resource,
'ext': 'vtt',
})
return {
'id': video_id,
'formats': list(self._extract_formats(video_assets, video_id)),
'subtitles': subtitles,
**traverse_obj(doc, {
'title': ('title', {str}),
'description': ('description', {str}),
'timestamp': ('date', {parse_iso8601}),
'modified_timestamp': ('modificationDate', {parse_iso8601}),
'duration': ((
('durationInSeconds', {int_or_none}),
('duration', {parse_duration})), any),
'episode_number': ('episodeNumber', {int_or_none}),
'season_number': ('season', {int_or_none}),
}),
}
def _extract_formats(self, media_info, video_id):
for media in traverse_obj(media_info, ('assets', lambda _, v: url_or_none(v['url']))):
stream_url = media['url']
ext = determine_ext(stream_url)
if ext == 'm3u8':
yield from self._extract_m3u8_formats(
stream_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
else:
yield {
'url': stream_url,
'format_id': ext,
**traverse_obj(media, {
'width': ('frameWidth', {int_or_none}),
'height': ('frameHeight', {int_or_none}),
# NB: filesize is 0 if unknown, bitrate is -1 if unknown
'filesize': ('fileSize', {int_or_none}, {lambda x: x or None}),
'abr': ('bitrateAudio', {int_or_none}, {lambda x: None if x == -1 else x}),
'vbr': ('bitrateVideo', {int_or_none}, {lambda x: None if x == -1 else x}),
}),
}

View File

@@ -0,0 +1,78 @@
import functools
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
clean_html,
extract_attributes,
get_element_by_class,
get_element_html_by_id,
join_nonempty,
parse_duration,
unified_timestamp,
)
from ..utils.traversal import traverse_obj
class LearningOnScreenIE(InfoExtractor):
_VALID_URL = r'https?://learningonscreen\.ac\.uk/ondemand/index\.php/prog/(?P<id>\w+)'
_TESTS = [{
'url': 'https://learningonscreen.ac.uk/ondemand/index.php/prog/005D81B2?bcast=22757013',
'info_dict': {
'id': '005D81B2',
'ext': 'mp4',
'title': 'Planet Earth',
'duration': 3600.0,
'timestamp': 1164567600.0,
'upload_date': '20061126',
'thumbnail': 'https://stream.learningonscreen.ac.uk/trilt-cover-images/005D81B2-Planet-Earth-2006-11-26T190000Z-BBC4.jpg',
},
}]
def _real_initialize(self):
if not self._get_cookies('https://learningonscreen.ac.uk/').get('PHPSESSID-BOB-LIVE'):
self.raise_login_required(
'Use --cookies for authentication. See '
' https://github.com/yt-dlp/yt-dlp/wiki/FAQ#how-do-i-pass-cookies-to-yt-dlp '
'for how to manually pass cookies', method=None)
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
details = traverse_obj(webpage, (
{functools.partial(get_element_html_by_id, 'programme-details')}, {
'title': ({functools.partial(re.search, r'<h2>([^<]+)</h2>')}, 1, {clean_html}),
'timestamp': (
{functools.partial(get_element_by_class, 'broadcast-date')},
{functools.partial(re.match, r'([^<]+)')}, 1, {unified_timestamp}),
'duration': (
{functools.partial(get_element_by_class, 'prog-running-time')},
{clean_html}, {parse_duration}),
}))
title = details.pop('title', None) or traverse_obj(webpage, (
{functools.partial(get_element_html_by_id, 'add-to-existing-playlist')},
{extract_attributes}, 'data-record-title', {clean_html}))
entries = self._parse_html5_media_entries(
'https://stream.learningonscreen.ac.uk', webpage, video_id, m3u8_id='hls', mpd_id='dash',
_headers={'Origin': 'https://learningonscreen.ac.uk', 'Referer': 'https://learningonscreen.ac.uk/'})
if not entries:
raise ExtractorError('No video found')
if len(entries) > 1:
duration = details.pop('duration', None)
for idx, entry in enumerate(entries, start=1):
entry.update(details)
entry['id'] = join_nonempty(video_id, idx)
entry['title'] = join_nonempty(title, idx)
return self.playlist_result(entries, video_id, title, duration=duration)
return {
**entries[0],
**details,
'id': video_id,
'title': title,
}

View File

@@ -1,86 +1,11 @@
from .common import InfoExtractor
from ..utils import (
clean_html,
format_field,
int_or_none,
parse_iso8601,
unified_strdate,
)
class LnkGoIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?lnk(?:go)?\.(?:alfa\.)?lt/(?:visi-video/[^/]+|video)/(?P<id>[A-Za-z0-9-]+)(?:/(?P<episode_id>\d+))?'
_TESTS = [{
'url': 'http://www.lnkgo.lt/visi-video/aktualai-pratesimas/ziurek-putka-trys-klausimai',
'info_dict': {
'id': '10809',
'ext': 'mp4',
'title': "Put'ka: Trys Klausimai",
'upload_date': '20161216',
'description': 'Seniai matytas Putka užduoda tris klausimėlius. Pabandykime surasti atsakymus.',
'age_limit': 18,
'duration': 117,
'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1481904000,
},
'params': {
'skip_download': True, # HLS download
},
}, {
'url': 'http://lnkgo.alfa.lt/visi-video/aktualai-pratesimas/ziurek-nerdas-taiso-kompiuteri-2',
'info_dict': {
'id': '10467',
'ext': 'mp4',
'title': 'Nėrdas: Kompiuterio Valymas',
'upload_date': '20150113',
'description': 'md5:7352d113a242a808676ff17e69db6a69',
'age_limit': 18,
'duration': 346,
'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1421164800,
},
'params': {
'skip_download': True, # HLS download
},
}, {
'url': 'https://lnk.lt/video/neigalieji-tv-bokste/37413',
'only_matching': True,
}]
_AGE_LIMITS = {
'N-7': 7,
'N-14': 14,
'S': 18,
}
_M3U8_TEMPL = 'https://vod.lnk.lt/lnk_vod/lnk/lnk/%s:%s/playlist.m3u8%s'
def _real_extract(self, url):
display_id, video_id = self._match_valid_url(url).groups()
video_info = self._download_json(
'https://lnk.lt/api/main/video-page/{}/{}/false'.format(display_id, video_id or '0'),
display_id)['videoConfig']['videoInfo']
video_id = str(video_info['id'])
title = video_info['title']
prefix = 'smil' if video_info.get('isQualityChangeAvailable') else 'mp4'
formats = self._extract_m3u8_formats(
self._M3U8_TEMPL % (prefix, video_info['videoUrl'], video_info.get('secureTokenParams') or ''),
video_id, 'mp4', 'm3u8_native')
return {
'id': video_id,
'display_id': display_id,
'title': title,
'formats': formats,
'thumbnail': format_field(video_info, 'posterImage', 'https://lnk.lt/all-images/%s'),
'duration': int_or_none(video_info.get('duration')),
'description': clean_html(video_info.get('htmlDescription')),
'age_limit': self._AGE_LIMITS.get(video_info.get('pgRating'), 0),
'timestamp': parse_iso8601(video_info.get('airDate')),
'view_count': int_or_none(video_info.get('viewsCount')),
}
class LnkIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?lnk\.lt/[^/]+/(?P<id>\d+)'

View File

@@ -92,9 +92,9 @@ class LoomIE(InfoExtractor):
},
'params': {'videopassword': 'seniorinfants2'},
}, {
# embed, transcoded-url endpoint sends empty JSON response
# embed, transcoded-url endpoint sends empty JSON response, split video and audio HLS formats
'url': 'https://www.loom.com/embed/ddcf1c1ad21f451ea7468b1e33917e4e',
'md5': '8488817242a0db1cb2ad0ea522553cf6',
'md5': 'b321d261656848c184a94e3b93eae28d',
'info_dict': {
'id': 'ddcf1c1ad21f451ea7468b1e33917e4e',
'ext': 'mp4',
@@ -104,6 +104,7 @@ class LoomIE(InfoExtractor):
'timestamp': 1657216459,
'duration': 181,
},
'params': {'format': 'bestvideo'}, # Test video-only fixup
'expected_warnings': ['Failed to parse JSON'],
}]
_WEBPAGE_TESTS = [{
@@ -293,7 +294,11 @@ class LoomIE(InfoExtractor):
format_url = format_url.replace('-split.m3u8', '.m3u8')
m3u8_formats = self._extract_m3u8_formats(
format_url, video_id, 'mp4', m3u8_id=f'hls-{format_id}', fatal=False, quality=quality)
# Sometimes only split video/audio formats are available, need to fixup video-only formats
is_not_premerged = 'none' in traverse_obj(m3u8_formats, (..., 'vcodec'))
for fmt in m3u8_formats:
if is_not_premerged and fmt.get('vcodec') != 'none':
fmt['acodec'] = 'none'
yield {
**fmt,
'url': update_url(fmt['url'], query=query),

View File

@@ -126,7 +126,7 @@ class MailRuIE(InfoExtractor):
video_data = None
# fix meta_url if missing the host address
if re.match(r'^\/\+\/', meta_url):
if re.match(r'\/\+\/', meta_url):
meta_url = urljoin('https://my.mail.ru', meta_url)
if meta_url:

View File

@@ -13,8 +13,8 @@ from ..utils import (
class MDRIE(InfoExtractor):
IE_DESC = 'MDR.DE and KiKA'
_VALID_URL = r'https?://(?:www\.)?(?:mdr|kika)\.de/(?:.*)/[a-z-]+-?(?P<id>\d+)(?:_.+?)?\.html'
IE_DESC = 'MDR.DE'
_VALID_URL = r'https?://(?:www\.)?mdr\.de/(?:.*)/[a-z-]+-?(?P<id>\d+)(?:_.+?)?\.html'
_GEO_COUNTRIES = ['DE']
@@ -34,30 +34,6 @@ class MDRIE(InfoExtractor):
'uploader': 'MITTELDEUTSCHER RUNDFUNK',
},
'skip': '404 not found',
}, {
'url': 'http://www.kika.de/baumhaus/videos/video19636.html',
'md5': '4930515e36b06c111213e80d1e4aad0e',
'info_dict': {
'id': '19636',
'ext': 'mp4',
'title': 'Baumhaus vom 30. Oktober 2015',
'duration': 134,
'uploader': 'KIKA',
},
'skip': '404 not found',
}, {
'url': 'http://www.kika.de/sendungen/einzelsendungen/weihnachtsprogramm/videos/video8182.html',
'md5': '5fe9c4dd7d71e3b238f04b8fdd588357',
'info_dict': {
'id': '8182',
'ext': 'mp4',
'title': 'Beutolomäus und der geheime Weihnachtswunsch',
'description': 'md5:b69d32d7b2c55cbe86945ab309d39bbd',
'timestamp': 1482541200,
'upload_date': '20161224',
'duration': 4628,
'uploader': 'KIKA',
},
}, {
# audio with alternative playerURL pattern
'url': 'http://www.mdr.de/kultur/videos-und-audios/audio-radio/operation-mindfuck-robert-wilson100.html',
@@ -68,28 +44,7 @@ class MDRIE(InfoExtractor):
'duration': 3239,
'uploader': 'MITTELDEUTSCHER RUNDFUNK',
},
}, {
# empty bitrateVideo and bitrateAudio
'url': 'https://www.kika.de/filme/sendung128372_zc-572e3f45_zs-1d9fb70e.html',
'info_dict': {
'id': '128372',
'ext': 'mp4',
'title': 'Der kleine Wichtel kehrt zurück',
'description': 'md5:f77fafdff90f7aa1e9dca14f662c052a',
'duration': 4876,
'timestamp': 1607823300,
'upload_date': '20201213',
'uploader': 'ZDF',
},
'params': {
'skip_download': True,
},
}, {
'url': 'http://www.kika.de/baumhaus/sendungen/video19636_zc-fea7f8a0_zs-4bf89c60.html',
'only_matching': True,
}, {
'url': 'http://www.kika.de/sendungen/einzelsendungen/weihnachtsprogramm/einzelsendung2534.html',
'only_matching': True,
'skip': '404 not found',
}, {
'url': 'http://www.mdr.de/mediathek/mdr-videos/a/video-1334.html',
'only_matching': True,

View File

@@ -16,6 +16,15 @@ class MediaKlikkIE(InfoExtractor):
(?P<id>[^/#?_]+)'''
_TESTS = [{
'url': 'https://mediaklikk.hu/filmajanlo/cikk/az-ajto/',
'info_dict': {
'id': '668177',
'title': 'Az ajtó',
'display_id': 'az-ajto',
'ext': 'mp4',
'thumbnail': 'https://cdn.cms.mtv.hu/wp-content/uploads/sites/4/2016/01/vlcsnap-2023-07-31-14h18m52s111.jpg',
},
}, {
# (old) mediaklikk. date in html.
'url': 'https://mediaklikk.hu/video/hazajaro-delnyugat-bacska-a-duna-menten-palankatol-doroszloig/',
'info_dict': {
@@ -37,6 +46,7 @@ class MediaKlikkIE(InfoExtractor):
'upload_date': '20230903',
'thumbnail': 'https://mediaklikk.hu/wp-content/uploads/sites/4/2014/02/hazajarouj_JO.jpg',
},
'skip': 'Webpage redirects to 404 page',
}, {
# (old) m4sport
'url': 'https://m4sport.hu/video/2021/08/30/gyemant-liga-parizs/',
@@ -59,6 +69,7 @@ class MediaKlikkIE(InfoExtractor):
'upload_date': '20230908',
'thumbnail': 'https://m4sport.hu/wp-content/uploads/sites/4/2023/09/vlcsnap-2023-09-08-22h43m18s691.jpg',
},
'skip': 'Webpage redirects to 404 page',
}, {
# m4sport with *video/ url and no date
'url': 'https://m4sport.hu/bl-video/real-madrid-chelsea-1-1/',
@@ -69,6 +80,7 @@ class MediaKlikkIE(InfoExtractor):
'ext': 'mp4',
'thumbnail': 'https://m4sport.hu/wp-content/uploads/sites/4/2021/04/Sequence-01.Still001-1024x576.png',
},
'skip': 'Webpage redirects to 404 page',
}, {
# (old) hirado
'url': 'https://hirado.hu/videok/felteteleket-szabott-a-fovaros/',
@@ -90,6 +102,7 @@ class MediaKlikkIE(InfoExtractor):
'upload_date': '20230911',
'thumbnail': 'https://hirado.hu/wp-content/uploads/sites/4/2023/09/vlcsnap-2023-09-11-09h16m09s882.jpg',
},
'skip': 'Webpage redirects to video list page',
}, {
# (old) petofilive
'url': 'https://petofilive.hu/video/2021/06/07/tha-shudras-az-akusztikban/',
@@ -112,6 +125,7 @@ class MediaKlikkIE(InfoExtractor):
'upload_date': '20230909',
'thumbnail': 'https://petofilive.hu/wp-content/uploads/sites/4/2023/09/Clipboard11-2.jpg',
},
'skip': 'Webpage redirects to video list page',
}]
def _real_extract(self, url):
@@ -133,7 +147,9 @@ class MediaKlikkIE(InfoExtractor):
r'<p+\b[^>]+\bclass="article_date">([^<]+)<', webpage, 'upload date', default=None))
player_data['video'] = player_data.pop('token')
player_page = self._download_webpage('https://player.mediaklikk.hu/playernew/player.php', video_id, query=player_data)
player_page = self._download_webpage(
'https://player.mediaklikk.hu/playernew/player.php', video_id,
query=player_data, headers={'Referer': url})
player_json = self._search_json(
r'\bpl\.setup\s*\(', player_page, 'player json', video_id, end_pattern=r'\);')
playlist_url = traverse_obj(
@@ -141,14 +157,14 @@ class MediaKlikkIE(InfoExtractor):
if not playlist_url:
raise ExtractorError('Unable to extract playlist url')
formats = self._extract_wowza_formats(
playlist_url, video_id, skip_protocols=['f4m', 'smil', 'dash'])
formats, subtitles = self._extract_m3u8_formats_and_subtitles(playlist_url, video_id)
return {
'id': video_id,
'title': title,
'display_id': display_id,
'formats': formats,
'subtitles': subtitles,
'upload_date': upload_date,
'thumbnail': player_data.get('bgImage') or self._og_search_thumbnail(webpage),
}

View File

@@ -16,7 +16,7 @@ from ..utils import (
class MGTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:w(?:ww)?\.)?mgtv\.com/(v|b)/(?:[^/]+/)*(?P<id>\d+)\.html'
_VALID_URL = r'https?://(?:w(?:ww)?\.)?mgtv\.com/[bv]/(?:[^/]+/)*(?P<id>\d+)\.html'
IE_DESC = '芒果TV'
IE_NAME = 'MangoTV'

View File

@@ -65,7 +65,7 @@ class TechTVMITIE(InfoExtractor):
class OCWMITIE(InfoExtractor):
IE_NAME = 'ocw.mit.edu'
_VALID_URL = r'^https?://ocw\.mit\.edu/courses/(?P<topic>[a-z0-9\-]+)'
_VALID_URL = r'https?://ocw\.mit\.edu/courses/(?P<topic>[a-z0-9\-]+)'
_BASE_URL = 'http://ocw.mit.edu/'
_TESTS = [

View File

@@ -1,16 +1,21 @@
import json
import re
import urllib.parse
import time
import uuid
from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
determine_ext,
int_or_none,
join_nonempty,
jwt_decode_hs256,
parse_duration,
parse_iso8601,
try_get,
url_or_none,
urlencode_postdata,
)
from ..utils.traversal import traverse_obj
@@ -276,81 +281,225 @@ class MLBVideoIE(MLBBaseIE):
class MLBTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?mlb\.com/tv/g(?P<id>\d{6})'
_NETRC_MACHINE = 'mlb'
_TESTS = [{
'url': 'https://www.mlb.com/tv/g661581/vee2eff5f-a7df-4c20-bdb4-7b926fa12638',
'info_dict': {
'id': '661581',
'ext': 'mp4',
'title': '2022-07-02 - St. Louis Cardinals @ Philadelphia Phillies',
'release_date': '20220702',
'release_timestamp': 1656792300,
},
'params': {
'skip_download': True,
'params': {'skip_download': 'm3u8'},
}, {
# makeup game: has multiple dates, need to avoid games with 'rescheduleDate'
'url': 'https://www.mlb.com/tv/g747039/vd22541c4-5a29-45f7-822b-635ec041cf5e',
'info_dict': {
'id': '747039',
'ext': 'mp4',
'title': '2024-07-29 - Toronto Blue Jays @ Baltimore Orioles',
'release_date': '20240729',
'release_timestamp': 1722280200,
},
'params': {'skip_download': 'm3u8'},
}]
_GRAPHQL_INIT_QUERY = '''\
mutation initSession($device: InitSessionInput!, $clientType: ClientType!, $experience: ExperienceTypeInput) {
initSession(device: $device, clientType: $clientType, experience: $experience) {
deviceId
sessionId
entitlements {
code
}
location {
countryCode
regionName
zipCode
latitude
longitude
}
clientExperience
features
}
}'''
_GRAPHQL_PLAYBACK_QUERY = '''\
mutation initPlaybackSession(
$adCapabilities: [AdExperienceType]
$mediaId: String!
$deviceId: String!
$sessionId: String!
$quality: PlaybackQuality
) {
initPlaybackSession(
adCapabilities: $adCapabilities
mediaId: $mediaId
deviceId: $deviceId
sessionId: $sessionId
quality: $quality
) {
playbackSessionId
playback {
url
token
expiration
cdn
}
}
}'''
_APP_VERSION = '7.8.2'
_device_id = None
_session_id = None
_access_token = None
_token_expiry = 0
@property
def _api_headers(self):
if (self._token_expiry - 120) <= time.time():
self.write_debug('Access token has expired; re-logging in')
self._perform_login(*self._get_login_info())
return {'Authorization': f'Bearer {self._access_token}'}
def _real_initialize(self):
if not self._access_token:
self.raise_login_required(
'All videos are only available to registered users', method='password')
def _set_device_id(self, username):
if not self._device_id:
self._device_id = self.cache.load(
self._NETRC_MACHINE, 'device_ids', default={}).get(username)
if self._device_id:
return
self._device_id = str(uuid.uuid4())
self.cache.store(self._NETRC_MACHINE, 'device_ids', {username: self._device_id})
def _perform_login(self, username, password):
data = f'grant_type=password&username={urllib.parse.quote(username)}&password={urllib.parse.quote(password)}&scope=openid offline_access&client_id=0oa3e1nutA1HLzAKG356'
access_token = self._download_json(
'https://ids.mlb.com/oauth2/aus1m088yK07noBfh356/v1/token', None,
headers={
'User-Agent': 'okhttp/3.12.1',
'Content-Type': 'application/x-www-form-urlencoded',
}, data=data.encode())['access_token']
try:
self._access_token = self._download_json(
'https://ids.mlb.com/oauth2/aus1m088yK07noBfh356/v1/token', None,
'Logging in', 'Unable to log in', headers={
'User-Agent': 'okhttp/3.12.1',
'Content-Type': 'application/x-www-form-urlencoded',
}, data=urlencode_postdata({
'grant_type': 'password',
'username': username,
'password': password,
'scope': 'openid offline_access',
'client_id': '0oa3e1nutA1HLzAKG356',
}))['access_token']
except ExtractorError as error:
if isinstance(error.cause, HTTPError) and error.cause.status == 400:
raise ExtractorError('Invalid username or password', expected=True)
raise
entitlement = self._download_webpage(
f'https://media-entitlement.mlb.com/api/v3/jwt?os=Android&appname=AtBat&did={uuid.uuid4()}', None,
headers={
'User-Agent': 'okhttp/3.12.1',
'Authorization': f'Bearer {access_token}',
})
self._token_expiry = traverse_obj(self._access_token, ({jwt_decode_hs256}, 'exp', {int})) or 0
self._set_device_id(username)
data = f'grant_type=urn:ietf:params:oauth:grant-type:token-exchange&subject_token={entitlement}&subject_token_type=urn:ietf:params:oauth:token-type:jwt&platform=android-tv'
self._access_token = self._download_json(
'https://us.edge.bamgrid.com/token', None,
self._session_id = self._call_api({
'operationName': 'initSession',
'query': self._GRAPHQL_INIT_QUERY,
'variables': {
'device': {
'appVersion': self._APP_VERSION,
'deviceFamily': 'desktop',
'knownDeviceId': self._device_id,
'languagePreference': 'ENGLISH',
'manufacturer': '',
'model': '',
'os': '',
'osVersion': '',
},
'clientType': 'WEB',
},
}, None, 'session ID')['data']['initSession']['sessionId']
def _call_api(self, data, video_id, description='GraphQL JSON', fatal=True):
return self._download_json(
'https://media-gateway.mlb.com/graphql', video_id,
f'Downloading {description}', f'Unable to download {description}', fatal=fatal,
headers={
**self._api_headers,
'Accept': 'application/json',
'Authorization': 'Bearer bWxidHYmYW5kcm9pZCYxLjAuMA.6LZMbH2r--rbXcgEabaDdIslpo4RyZrlVfWZhsAgXIk',
'Content-Type': 'application/x-www-form-urlencoded',
}, data=data.encode())['access_token']
'Content-Type': 'application/json',
'x-client-name': 'WEB',
'x-client-version': self._APP_VERSION,
}, data=json.dumps(data, separators=(',', ':')).encode())
def _extract_formats_and_subtitles(self, broadcast, video_id):
feed = traverse_obj(broadcast, ('homeAway', {str.title}))
medium = traverse_obj(broadcast, ('type', {str}))
language = traverse_obj(broadcast, ('language', {str.lower}))
format_id = join_nonempty(feed, medium, language)
response = self._call_api({
'operationName': 'initPlaybackSession',
'query': self._GRAPHQL_PLAYBACK_QUERY,
'variables': {
'adCapabilities': ['GOOGLE_STANDALONE_AD_PODS'],
'deviceId': self._device_id,
'mediaId': broadcast['mediaId'],
'quality': 'PLACEHOLDER',
'sessionId': self._session_id,
},
}, video_id, f'{format_id} broadcast JSON', fatal=False)
playback = traverse_obj(response, ('data', 'initPlaybackSession', 'playback', {dict}))
m3u8_url = traverse_obj(playback, ('url', {url_or_none}))
token = traverse_obj(playback, ('token', {str}))
if not (m3u8_url and token):
errors = '; '.join(traverse_obj(response, ('errors', ..., 'message', {str})))
if 'not entitled' in errors:
raise ExtractorError(errors, expected=True)
elif errors: # Only warn when 'blacked out' since radio formats are available
self.report_warning(f'API returned errors for {format_id}: {errors}')
else:
self.report_warning(f'No formats available for {format_id} broadcast; skipping')
return [], {}
cdn_headers = {'x-cdn-token': token}
fmts, subs = self._extract_m3u8_formats_and_subtitles(
m3u8_url.replace(f'/{token}/', '/'), video_id, 'mp4',
m3u8_id=format_id, fatal=False, headers=cdn_headers)
for fmt in fmts:
fmt['http_headers'] = cdn_headers
fmt.setdefault('format_note', join_nonempty(feed, medium, delim=' '))
fmt.setdefault('language', language)
if fmt.get('vcodec') == 'none' and fmt['language'] == 'en':
fmt['source_preference'] = 10
return fmts, subs
def _real_extract(self, url):
video_id = self._match_id(url)
airings = self._download_json(
f'https://search-api-mlbtv.mlb.com/svc/search/v2/graphql/persisted/query/core/Airings?variables=%7B%22partnerProgramIds%22%3A%5B%22{video_id}%22%5D%2C%22applyEsniMediaRightsLabels%22%3Atrue%7D',
video_id)['data']['Airings']
data = self._download_json(
'https://statsapi.mlb.com/api/v1/schedule', video_id, query={
'gamePk': video_id,
'hydrate': 'broadcasts(all),statusFlags',
})
metadata = traverse_obj(data, (
'dates', ..., 'games',
lambda _, v: str(v['gamePk']) == video_id and not v.get('rescheduleDate'), any))
broadcasts = traverse_obj(metadata, (
'broadcasts', lambda _, v: v['mediaId'] and v['mediaState']['mediaStateCode'] != 'MEDIA_OFF'))
formats, subtitles = [], {}
for airing in traverse_obj(airings, lambda _, v: v['playbackUrls'][0]['href']):
format_id = join_nonempty('feedType', 'feedLanguage', from_dict=airing)
m3u8_url = traverse_obj(self._download_json(
airing['playbackUrls'][0]['href'].format(scenario='browser~csai'), video_id,
note=f'Downloading {format_id} stream info JSON',
errnote=f'Failed to download {format_id} stream info, skipping',
fatal=False, headers={
'Authorization': self._access_token,
'Accept': 'application/vnd.media-service+json; version=2',
}), ('stream', 'complete', {url_or_none}))
if not m3u8_url:
continue
f, s = self._extract_m3u8_formats_and_subtitles(
m3u8_url, video_id, 'mp4', m3u8_id=format_id, fatal=False)
formats.extend(f)
self._merge_subtitles(s, target=subtitles)
for broadcast in broadcasts:
fmts, subs = self._extract_formats_and_subtitles(broadcast, video_id)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
return {
'id': video_id,
'title': traverse_obj(airings, (..., 'titles', 0, 'episodeName'), get_all=False),
'is_live': traverse_obj(airings, (..., 'mediaConfig', 'productType'), get_all=False) == 'LIVE',
'title': join_nonempty(
traverse_obj(metadata, ('officialDate', {str})),
traverse_obj(metadata, ('teams', ('away', 'home'), 'team', 'name', {str}, all, {' @ '.join})),
delim=' - '),
'is_live': traverse_obj(broadcasts, (..., 'mediaState', 'mediaStateCode', {str}, any)) == 'MEDIA_ON',
'release_timestamp': traverse_obj(metadata, ('gameDate', {parse_iso8601})),
'formats': formats,
'subtitles': subtitles,
'http_headers': {'Authorization': f'Bearer {self._access_token}'},
}

View File

@@ -0,0 +1,121 @@
from .common import InfoExtractor
from ..utils import js_to_json, remove_end, update_url_query
class MojevideoIE(InfoExtractor):
IE_DESC = 'mojevideo.sk'
_VALID_URL = r'https?://(?:www\.)?mojevideo\.sk/video/(?P<id>\w+)/(?P<display_id>[\w()]+?)\.html'
_TESTS = [{
'url': 'https://www.mojevideo.sk/video/3d17c/chlapci_dobetonovali_sme_mame_hotovo.html',
'md5': '384a4628bd2bbd261c5206cf77c38c17',
'info_dict': {
'id': '3d17c',
'ext': 'mp4',
'title': 'Chlapci dobetónovali sme, máme hotovo!',
'display_id': 'chlapci_dobetonovali_sme_mame_hotovo',
'description': 'md5:a0822126044050d304a9ef58c92ddb34',
'thumbnail': 'https://fs5.mojevideo.sk/imgfb/250236.jpg',
'duration': 21.0,
'upload_date': '20230919',
'timestamp': 1695129706,
'like_count': int,
'dislike_count': int,
'view_count': int,
'comment_count': int,
},
}, {
# 720p
'url': 'https://www.mojevideo.sk/video/14677/den_blbec.html',
'md5': '517c3e111c53a67d10b429c1f344ba2f',
'info_dict': {
'id': '14677',
'ext': 'mp4',
'title': 'Deň blbec?',
'display_id': 'den_blbec',
'description': 'I maličkosť vám môže zmeniť celý deň. Nikdy nezahadzujte žuvačky na zem!',
'thumbnail': 'https://fs5.mojevideo.sk/imgfb/83575.jpg',
'duration': 100.0,
'upload_date': '20120515',
'timestamp': 1337076481,
'like_count': int,
'dislike_count': int,
'view_count': int,
'comment_count': int,
},
}, {
# 1080p
'url': 'https://www.mojevideo.sk/video/2feb2/band_maid_onset_(instrumental)_live_zepp_tokyo_(full_hd).html',
'md5': '64599a23d3ac31cf2fe069e4353d8162',
'info_dict': {
'id': '2feb2',
'ext': 'mp4',
'title': 'BAND-MAID - onset (Instrumental) Live - Zepp Tokyo (Full HD)',
'display_id': 'band_maid_onset_(instrumental)_live_zepp_tokyo_(full_hd)',
'description': 'Výborná inštrumentálna skladba od skupiny BAND-MAID.',
'thumbnail': 'https://fs5.mojevideo.sk/imgfb/196274.jpg',
'duration': 240.0,
'upload_date': '20190708',
'timestamp': 1562576592,
'like_count': int,
'dislike_count': int,
'view_count': int,
'comment_count': int,
},
}, {
# 720p
'url': 'https://www.mojevideo.sk/video/358c8/dva_nissany_skyline_strielaju_v_londyne.html',
'only_matching': True,
}, {
# 720p
'url': 'https://www.mojevideo.sk/video/2455d/gopro_hero4_session_nova_sportova_vodotesna_kamera.html',
'only_matching': True,
}, {
# 1080p
'url': 'https://www.mojevideo.sk/video/352ee/amd_rx_6800_xt_vs_nvidia_rtx_3080_(test_v_9_hrach).html',
'only_matching': True,
}, {
# 1080p
'url': 'https://www.mojevideo.sk/video/2cbeb/trailer_z_avengers_infinity_war.html',
'only_matching': True,
}]
def _real_extract(self, url):
video_id, display_id = self._match_valid_url(url).groups()
webpage = self._download_webpage(url, video_id)
video_id_dec = self._search_regex(
r'\bvId\s*=\s*(\d+)', webpage, 'video id', fatal=False) or str(int(video_id, 16))
video_exp = self._search_regex(r'\bvEx\s*=\s*["\'](\d+)', webpage, 'video expiry')
video_hashes = self._search_json(
r'\bvHash\s*=', webpage, 'video hashes', video_id,
contains_pattern=r'\[(?s:.+)\]', transform_source=js_to_json)
formats = []
for video_hash, (suffix, quality, format_note) in zip(video_hashes, [
('', 1, 'normálna kvalita'),
('_lq', 0, 'nízka kvalita'),
('_hd', 2, 'HD-720p'),
('_fhd', 3, 'FULL HD-1080p'),
('_2k', 4, '2K-1440p'),
]):
formats.append({
'format_id': f'mp4-{quality}',
'quality': quality,
'format_note': format_note,
'url': update_url_query(
f'https://cache01.mojevideo.sk/securevideos69/{video_id_dec}{suffix}.mp4', {
'md5': video_hash,
'expires': video_exp,
}),
})
return {
'id': video_id,
'display_id': display_id,
'formats': formats,
'title': (self._og_search_title(webpage, default=None)
or remove_end(self._html_extract_title(webpage, 'title'), ' - Mojevideo')),
'description': self._og_search_description(webpage),
**self._search_json_ld(webpage, video_id, default={}),
}

View File

@@ -5,39 +5,103 @@ from .common import InfoExtractor
from ..utils import (
ExtractorError,
OnDemandPagedList,
determine_ext,
int_or_none,
try_get,
clean_html,
extract_attributes,
get_element_by_class,
get_element_html_by_id,
parse_count,
remove_end,
update_url,
urlencode_postdata,
)
class MurrtubeIE(InfoExtractor):
_WORKING = False
_VALID_URL = r'''(?x)
(?:
murrtube:|
https?://murrtube\.net/videos/(?P<slug>[a-z0-9\-]+)\-
https?://murrtube\.net/(?:v/|videos/(?P<slug>[a-z0-9-]+?)-)
)
(?P<id>[a-f0-9]{8}\-[a-f0-9]{4}\-[a-f0-9]{4}\-[a-f0-9]{4}\-[a-f0-9]{12})
(?P<id>[A-Z0-9]{4}|[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12})
'''
_TEST = {
_TESTS = [{
'url': 'https://murrtube.net/videos/inferno-x-skyler-148b6f2a-fdcc-4902-affe-9c0f41aaaca0',
'md5': '169f494812d9a90914b42978e73aa690',
'md5': '70380878a77e8565d4aea7f68b8bbb35',
'info_dict': {
'id': '148b6f2a-fdcc-4902-affe-9c0f41aaaca0',
'id': 'ca885d8456b95de529b6723b158032e11115d',
'ext': 'mp4',
'title': 'Inferno X Skyler',
'description': 'Humping a very good slutty sheppy (roomate)',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 284,
'uploader': 'Inferno Wolf',
'age_limit': 18,
'thumbnail': 'https://storage.murrtube.net/murrtube-production/ekbs3zcfvuynnqfx72nn2tkokvsd',
'comment_count': int,
'view_count': int,
'like_count': int,
'tags': ['hump', 'breed', 'Fursuit', 'murrsuit', 'bareback'],
},
}
}, {
'url': 'https://murrtube.net/v/0J2Q',
'md5': '31262f6ac56f0ca75e5a54a0f3fefcb6',
'info_dict': {
'id': '8442998c52134968d9caa36e473e1a6bac6ca',
'ext': 'mp4',
'uploader': 'Hayel',
'title': 'Who\'s in charge now?',
'description': 'md5:795791e97e5b0f1805ea84573f02a997',
'age_limit': 18,
'thumbnail': 'https://storage.murrtube.net/murrtube-production/fb1ojjwiucufp34ya6hxu5vfqi5s',
'comment_count': int,
'view_count': int,
'like_count': int,
},
}]
def _extract_count(self, name, html):
return parse_count(self._search_regex(
rf'([\d,]+)\s+<span[^>]*>{name}</span>', html, name, default=None))
def _real_initialize(self):
homepage = self._download_webpage(
'https://murrtube.net', None, note='Getting session token')
self._request_webpage(
'https://murrtube.net/accept_age_check', None, 'Setting age cookie',
data=urlencode_postdata(self._hidden_inputs(homepage)))
def _real_extract(self, url):
video_id = self._match_id(url)
if video_id.startswith('murrtube:'):
raise ExtractorError('Support for murrtube: prefix URLs is broken')
video_page = self._download_webpage(url, video_id)
video_attrs = extract_attributes(get_element_html_by_id('video', video_page))
playlist = update_url(video_attrs['data-url'], query=None)
video_id = self._search_regex(r'/([\da-f]+)/index.m3u8', playlist, 'video id')
return {
'id': video_id,
'title': remove_end(self._og_search_title(video_page), ' - Murrtube'),
'age_limit': 18,
'formats': self._extract_m3u8_formats(playlist, video_id, 'mp4'),
'description': self._og_search_description(video_page),
'thumbnail': update_url(self._og_search_thumbnail(video_page, default=''), query=None) or None,
'uploader': clean_html(get_element_by_class('pl-1 is-size-6 has-text-lighter', video_page)),
'view_count': self._extract_count('Views', video_page),
'like_count': self._extract_count('Likes', video_page),
'comment_count': self._extract_count('Comments', video_page),
}
class MurrtubeUserIE(InfoExtractor):
_WORKING = False
IE_DESC = 'Murrtube user profile'
_VALID_URL = r'https?://murrtube\.net/(?P<id>[^/]+)$'
_TESTS = [{
'url': 'https://murrtube.net/stormy',
'info_dict': {
'id': 'stormy',
},
'playlist_mincount': 27,
}]
_PAGE_SIZE = 10
def _download_gql(self, video_id, op, note=None, fatal=True):
result = self._download_json(
@@ -46,73 +110,6 @@ class MurrtubeIE(InfoExtractor):
headers={'Content-Type': 'application/json'})
return result['data']
def _real_extract(self, url):
video_id = self._match_id(url)
data = self._download_gql(video_id, {
'operationName': 'Medium',
'variables': {
'id': video_id,
},
'query': '''\
query Medium($id: ID!) {
medium(id: $id) {
title
description
key
duration
commentsCount
likesCount
viewsCount
thumbnailKey
tagList
user {
name
__typename
}
__typename
}
}'''})
meta = data['medium']
storage_url = 'https://storage.murrtube.net/murrtube/'
format_url = storage_url + meta.get('key', '')
thumbnail = storage_url + meta.get('thumbnailKey', '')
if determine_ext(format_url) == 'm3u8':
formats = self._extract_m3u8_formats(
format_url, video_id, 'mp4', entry_protocol='m3u8_native', fatal=False)
else:
formats = [{'url': format_url}]
return {
'id': video_id,
'title': meta.get('title'),
'description': meta.get('description'),
'formats': formats,
'thumbnail': thumbnail,
'duration': int_or_none(meta.get('duration')),
'uploader': try_get(meta, lambda x: x['user']['name']),
'view_count': meta.get('viewsCount'),
'like_count': meta.get('likesCount'),
'comment_count': meta.get('commentsCount'),
'tags': meta.get('tagList'),
'age_limit': 18,
}
class MurrtubeUserIE(MurrtubeIE): # XXX: Do not subclass from concrete IE
_WORKING = False
IE_DESC = 'Murrtube user profile'
_VALID_URL = r'https?://murrtube\.net/(?P<id>[^/]+)$'
_TEST = {
'url': 'https://murrtube.net/stormy',
'info_dict': {
'id': 'stormy',
},
'playlist_mincount': 27,
}
_PAGE_SIZE = 10
def _fetch_page(self, username, user_id, page):
data = self._download_gql(username, {
'operationName': 'Media',

View File

@@ -40,7 +40,6 @@ class NiconicoIE(InfoExtractor):
_TESTS = [{
'url': 'http://www.nicovideo.jp/watch/sm22312215',
'md5': 'd1a75c0823e2f629128c43e1212760f9',
'info_dict': {
'id': 'sm22312215',
'ext': 'mp4',
@@ -56,8 +55,8 @@ class NiconicoIE(InfoExtractor):
'comment_count': int,
'genres': ['未設定'],
'tags': [],
'expected_protocol': str,
},
'params': {'skip_download': 'm3u8'},
}, {
# File downloaded with and without credentials are different, so omit
# the md5 field
@@ -77,8 +76,8 @@ class NiconicoIE(InfoExtractor):
'view_count': int,
'genres': ['音楽・サウンド'],
'tags': ['Translation_Request', 'Kagamine_Rin', 'Rin_Original'],
'expected_protocol': str,
},
'params': {'skip_download': 'm3u8'},
}, {
# 'video exists but is marked as "deleted"
# md5 is unstable
@@ -112,7 +111,6 @@ class NiconicoIE(InfoExtractor):
}, {
# video not available via `getflv`; "old" HTML5 video
'url': 'http://www.nicovideo.jp/watch/sm1151009',
'md5': 'f95a3d259172667b293530cc2e41ebda',
'info_dict': {
'id': 'sm1151009',
'ext': 'mp4',
@@ -128,11 +126,10 @@ class NiconicoIE(InfoExtractor):
'comment_count': int,
'genres': ['ゲーム'],
'tags': [],
'expected_protocol': str,
},
'params': {'skip_download': 'm3u8'},
}, {
# "New" HTML5 video
# md5 is unstable
'url': 'http://www.nicovideo.jp/watch/sm31464864',
'info_dict': {
'id': 'sm31464864',
@@ -149,12 +146,11 @@ class NiconicoIE(InfoExtractor):
'comment_count': int,
'genres': ['アニメ'],
'tags': [],
'expected_protocol': str,
},
'params': {'skip_download': 'm3u8'},
}, {
# Video without owner
'url': 'http://www.nicovideo.jp/watch/sm18238488',
'md5': 'd265680a1f92bdcbbd2a507fc9e78a9e',
'info_dict': {
'id': 'sm18238488',
'ext': 'mp4',
@@ -168,8 +164,8 @@ class NiconicoIE(InfoExtractor):
'comment_count': int,
'genres': ['エンターテイメント'],
'tags': [],
'expected_protocol': str,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'http://sp.nicovideo.jp/watch/sm28964488?ss_pos=1&cp_in=wt_tg',
'only_matching': True,
@@ -424,7 +420,7 @@ class NiconicoIE(InfoExtractor):
'x-request-with': 'https://www.nicovideo.jp',
})['data']['contentUrl']
# Getting all audio formats results in duplicate video formats which we filter out later
dms_fmts = self._extract_m3u8_formats(dms_m3u8_url, video_id)
dms_fmts = self._extract_m3u8_formats(dms_m3u8_url, video_id, 'mp4')
# m3u8 extraction does not provide audio bitrates, so extract from the API data and fix
for audio_fmt in traverse_obj(dms_fmts, lambda _, v: v['vcodec'] == 'none'):
@@ -436,7 +432,6 @@ class NiconicoIE(InfoExtractor):
'asr': ('samplingRate', {int_or_none}),
}), get_all=False),
'acodec': 'aac',
'ext': 'm4a',
}
# Sort before removing dupes to keep the format dicts with the lowest tbr
@@ -458,9 +453,11 @@ class NiconicoIE(InfoExtractor):
if video_id.startswith('so'):
video_id = self._match_id(handle.url)
api_data = self._parse_json(self._html_search_regex(
'data-api-data="([^"]+)"', webpage,
'API data', default='{}'), video_id)
api_data = traverse_obj(
self._parse_json(self._html_search_meta('server-response', webpage) or '', video_id),
('data', 'response', {dict}))
if not api_data:
raise ExtractorError('Server response data not found')
except ExtractorError as e:
try:
api_data = self._download_json(

View File

@@ -10,7 +10,7 @@ from ..utils import (
class NZOnScreenIE(InfoExtractor):
_VALID_URL = r'^https?://www\.nzonscreen\.com/title/(?P<id>[^/?#]+)'
_VALID_URL = r'https?://www\.nzonscreen\.com/title/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.nzonscreen.com/title/shoop-shoop-diddy-wop-cumma-cumma-wang-dang-1982',
'info_dict': {

View File

@@ -1,9 +1,6 @@
import re
from .common import InfoExtractor
from ..utils import (
extract_attributes,
)
class NZZIE(InfoExtractor):
@@ -22,19 +19,14 @@ class NZZIE(InfoExtractor):
'playlist_count': 1,
}]
def _entries(self, webpage, page_id):
for script in re.findall(r'(?s)<script[^>]* data-hid="jw-video-jw[^>]+>(.+?)</script>', webpage):
settings = self._search_json(r'var\s+settings\s*=[^{]*', script, 'settings', page_id, fatal=False)
if entry := self._parse_jwplayer_data(settings, page_id):
yield entry
def _real_extract(self, url):
page_id = self._match_id(url)
webpage = self._download_webpage(url, page_id)
entries = []
for player_element in re.findall(
r'(<[^>]+class="kalturaPlayer[^"]*"[^>]*>)', webpage):
player_params = extract_attributes(player_element)
if player_params.get('data-type') not in ('kaltura_singleArticle',):
self.report_warning('Unsupported player type')
continue
entry_id = player_params['data-id']
entries.append(self.url_result(
'kaltura:1750922:' + entry_id, 'Kaltura', entry_id))
return self.playlist_result(entries, page_id)
return self.playlist_result(self._entries(webpage, page_id), page_id)

View File

@@ -1,9 +1,19 @@
from .common import InfoExtractor
from ..utils import int_or_none, try_get
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
int_or_none,
parse_iso8601,
parse_qs,
try_get,
update_url,
url_or_none,
)
from ..utils.traversal import traverse_obj
class OlympicsReplayIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?olympics\.com(?:/tokyo-2020)?/[a-z]{2}/(?:replay|video)/(?P<id>[^/#&?]+)'
_VALID_URL = r'https?://(?:www\.)?olympics\.com/[a-z]{2}/(?:paris-2024/)?(?:replay|videos?|original-series/episode)/(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://olympics.com/fr/video/men-s-109kg-group-a-weightlifting-tokyo-2020-replays',
'info_dict': {
@@ -11,26 +21,105 @@ class OlympicsReplayIE(InfoExtractor):
'ext': 'mp4',
'title': '+109kg (H) Groupe A - Haltérophilie | Replay de Tokyo 2020',
'upload_date': '20210801',
'timestamp': 1627783200,
'timestamp': 1627797600,
'description': 'md5:c66af4a5bc7429dbcc43d15845ff03b3',
'uploader': 'International Olympic Committee',
},
'params': {
'skip_download': True,
'thumbnail': 'https://img.olympics.com/images/image/private/t_1-1_1280/primary/nua4o7zwyaznoaejpbk2',
'duration': 7017.0,
},
}, {
'url': 'https://olympics.com/tokyo-2020/en/replay/bd242924-4b22-49a5-a846-f1d4c809250d/mens-bronze-medal-match-hun-esp',
'only_matching': True,
'url': 'https://olympics.com/en/original-series/episode/b-boys-and-b-girls-take-the-spotlight-breaking-life-road-to-paris-2024',
'info_dict': {
'id': '32633650-c5ee-4280-8b94-fb6defb6a9b5',
'ext': 'mp4',
'title': 'B-girl Nicka - Breaking Life, Road to Paris 2024 | Episode 1',
'upload_date': '20240517',
'timestamp': 1715948200,
'description': 'md5:f63d728a41270ec628f6ac33ce471bb1',
'thumbnail': 'https://img.olympics.com/images/image/private/t_1-1_1280/primary/a3j96l7j6so3vyfijby1',
'duration': 1321.0,
},
}, {
'url': 'https://olympics.com/en/paris-2024/videos/men-s-preliminaries-gbr-esp-ned-rsa-hockey-olympic-games-paris-2024',
'info_dict': {
'id': '3d96db23-8eee-4b7c-8ef5-488a0361026c',
'ext': 'mp4',
'title': 'Men\'s Preliminaries GBR-ESP & NED-RSA | Hockey | Olympic Games Paris 2024',
'upload_date': '20240727',
'timestamp': 1722066600,
},
'skip': 'Geo-restricted to RU, BR, BT, NP, TM, BD, TL',
}, {
'url': 'https://olympics.com/en/paris-2024/videos/dnp-suni-lee-i-have-goals-and-i-have-expectations-for-myself-but-i-also-am-trying-to-give-myself-grace',
'info_dict': {
'id': 'a42f37ab-8a74-41d0-a7d9-af27b7b02a90',
'ext': 'mp4',
'title': 'md5:c7cfbc9918636a98e66400a812e4d407',
'upload_date': '20240729',
'timestamp': 1722288600,
},
}]
_GEO_BYPASS = False
def _extract_from_nextjs_data(self, webpage, video_id):
data = traverse_obj(self._search_nextjs_data(webpage, video_id, default={}), (
'props', 'pageProps', 'page', 'items',
lambda _, v: v['name'] == 'videoPlaylist', 'data', 'currentVideo', {dict}, any))
if not data:
return None
geo_countries = traverse_obj(data, ('countries', ..., {str}))
if traverse_obj(data, ('geoRestrictedVideo', {bool})):
self.raise_geo_restricted(countries=geo_countries)
is_live = traverse_obj(data, ('streamingStatus', {str})) == 'LIVE'
m3u8_url = traverse_obj(data, ('videoUrl', {url_or_none})) or data['streamUrl']
tokenized_url = self._tokenize_url(m3u8_url, data['jwtToken'], is_live, video_id)
try:
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
tokenized_url, video_id, 'mp4', m3u8_id='hls')
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and 'georestricted' in e.cause.msg:
self.raise_geo_restricted(countries=geo_countries)
raise
return {
'formats': formats,
'subtitles': subtitles,
'is_live': is_live,
**traverse_obj(data, {
'id': ('videoID', {str}),
'title': ('title', {str}),
'timestamp': ('contentDate', {parse_iso8601}),
}),
}
def _tokenize_url(self, url, token, is_live, video_id):
return self._download_json(
'https://metering.olympics.com/tokengenerator', video_id,
'Downloading tokenized m3u8 url', query={
**parse_qs(url),
'url': update_url(url, query=None),
'service-id': 'live' if is_live else 'vod',
'user-auth': token,
})['data']['url']
def _legacy_tokenize_url(self, url, video_id):
return self._download_json(
'https://olympics.com/tokenGenerator', video_id,
'Downloading legacy tokenized m3u8 url', query={'url': url})
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
if info := self._extract_from_nextjs_data(webpage, video_id):
return info
title = self._html_search_meta(('title', 'og:title', 'twitter:title'), webpage)
uuid = self._html_search_meta('episode_uid', webpage)
video_uuid = self._html_search_meta('episode_uid', webpage)
m3u8_url = self._html_search_meta('video_url', webpage)
json_ld = self._search_json_ld(webpage, uuid)
json_ld = self._search_json_ld(webpage, video_uuid)
thumbnails_list = json_ld.get('image')
if not thumbnails_list:
thumbnails_list = self._html_search_regex(
@@ -48,12 +137,12 @@ class OlympicsReplayIE(InfoExtractor):
'width': width,
'height': int_or_none(try_get(width, lambda x: x * height_a / width_a)),
})
m3u8_url = self._download_json(
f'https://olympics.com/tokenGenerator?url={m3u8_url}', uuid, note='Downloading m3u8 url')
formats, subtitles = self._extract_m3u8_formats_and_subtitles(m3u8_url, uuid, 'mp4', m3u8_id='hls')
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
self._legacy_tokenize_url(m3u8_url, video_uuid), video_uuid, 'mp4', m3u8_id='hls')
return {
'id': uuid,
'id': video_uuid,
'title': title,
'thumbnails': thumbnails,
'formats': formats,

View File

@@ -420,7 +420,7 @@ class PatreonIE(PatreonBaseIE):
class PatreonCampaignIE(PatreonBaseIE):
_VALID_URL = r'https?://(?:www\.)?patreon\.com/(?!rss)(?:(?:m/(?P<campaign_id>\d+))|(?P<vanity>[-\w]+))'
_VALID_URL = r'https?://(?:www\.)?patreon\.com/(?!rss)(?:(?:m|api/campaigns)/(?P<campaign_id>\d+)|(?P<vanity>[-\w]+))'
_TESTS = [{
'url': 'https://www.patreon.com/dissonancepod/',
'info_dict': {
@@ -442,25 +442,44 @@ class PatreonCampaignIE(PatreonBaseIE):
'url': 'https://www.patreon.com/m/4767637/posts',
'info_dict': {
'title': 'Not Just Bikes',
'channel_follower_count': int,
'id': '4767637',
'channel_id': '4767637',
'channel_url': 'https://www.patreon.com/notjustbikes',
'description': 'md5:595c6e7dca76ae615b1d38c298a287a1',
'description': 'md5:9f4b70051216c4d5c58afe580ffc8d0f',
'age_limit': 0,
'channel': 'Not Just Bikes',
'uploader_url': 'https://www.patreon.com/notjustbikes',
'uploader': 'Not Just Bikes',
'uploader': 'Jason',
'uploader_id': '37306634',
'thumbnail': r're:^https?://.*$',
},
'playlist_mincount': 71,
}, {
'url': 'https://www.patreon.com/api/campaigns/4243769/posts',
'info_dict': {
'title': 'Second Thought',
'channel_follower_count': int,
'id': '4243769',
'channel_id': '4243769',
'channel_url': 'https://www.patreon.com/secondthought',
'description': 'md5:69c89a3aba43efdb76e85eb023e8de8b',
'age_limit': 0,
'channel': 'Second Thought',
'uploader_url': 'https://www.patreon.com/secondthought',
'uploader': 'JT Chapman',
'uploader_id': '32718287',
'thumbnail': r're:^https?://.*$',
},
'playlist_mincount': 201,
}, {
'url': 'https://www.patreon.com/dissonancepod/posts',
'only_matching': True,
}, {
'url': 'https://www.patreon.com/m/5932659',
'only_matching': True,
}, {
'url': 'https://www.patreon.com/api/campaigns/4243769',
'only_matching': True,
}]
@classmethod

View File

@@ -5,6 +5,7 @@ from ..utils import (
ExtractorError,
str_or_none,
traverse_obj,
update_url,
)
@@ -43,15 +44,16 @@ class PicartoIE(InfoExtractor):
url
}
}''' % (channel_id, channel_id), # noqa: UP031
})['data']
}, headers={'Accept': '*/*', 'Content-Type': 'application/json'})['data']
metadata = data['channel']
if metadata.get('online') == 0:
raise ExtractorError('Stream is offline', expected=True)
title = metadata['title']
cdn_data = self._download_json(
data['getLoadBalancerUrl']['url'] + '/stream/json_' + metadata['stream_name'] + '.js',
cdn_data = self._download_json(''.join((
update_url(data['getLoadBalancerUrl']['url'], scheme='https'),
'/stream/json_', metadata['stream_name'], '.js')),
channel_id, 'Downloading load balancing info')
formats = []
@@ -99,10 +101,10 @@ class PicartoVodIE(InfoExtractor):
},
'skip': 'The VOD does not exist',
}, {
'url': 'https://picarto.tv/ArtofZod/videos/772650',
'md5': '00067a0889f1f6869cc512e3e79c521b',
'url': 'https://picarto.tv/ArtofZod/videos/771008',
'md5': 'abef5322f2700d967720c4c6754b2a34',
'info_dict': {
'id': '772650',
'id': '771008',
'ext': 'mp4',
'title': 'Art of Zod - Drawing and Painting',
'thumbnail': r're:^https?://.*\.jpg',
@@ -131,7 +133,7 @@ class PicartoVodIE(InfoExtractor):
}}
}}
}}''',
})['data']['video']
}, headers={'Accept': '*/*', 'Content-Type': 'application/json'})['data']['video']
file_name = data['file_name']
netloc = urllib.parse.urlparse(data['video_recording_image_url']).netloc

View File

@@ -109,7 +109,7 @@ class PinterestBaseIE(InfoExtractor):
class PinterestIE(PinterestBaseIE):
_VALID_URL = rf'{PinterestBaseIE._VALID_URL_BASE}/pin/(?P<id>\d+)'
_VALID_URL = rf'{PinterestBaseIE._VALID_URL_BASE}/pin/(?:[\w-]+--)?(?P<id>\d+)'
_TESTS = [{
# formats found in data['videos']
'url': 'https://www.pinterest.com/pin/664281013778109217/',
@@ -174,6 +174,25 @@ class PinterestIE(PinterestBaseIE):
}, {
'url': 'https://co.pinterest.com/pin/824721750502199491/',
'only_matching': True,
},
{
'url': 'https://pinterest.com/pin/dive-into-serenity-blue-lagoon-pedi-nails-for-a-tranquil-and-refreshing-spa-experience-video-in-2024--2885187256207927',
'info_dict': {
'id': '2885187256207927',
'ext': 'mp4',
'title': 'Dive into Serenity: Blue Lagoon Pedi Nails for a Tranquil and Refreshing Spa Experience! 💙💅',
'description': 'md5:5da41c767d2317e42e49b663b0b2150f',
'uploader': 'Glamour Artistry |Everyday Outfits, Luxury Fashion & Nail Designs',
'uploader_id': '1142999717836434688',
'upload_date': '20240702',
'timestamp': 1719939156,
'duration': 7.967,
'comment_count': int,
'repost_count': int,
'categories': 'count:9',
'tags': ['#BlueLagoonPediNails', '#SpaExperience'],
'thumbnail': r're:^https?://.*\.(?:jpg|png)$',
},
}]
def _real_extract(self, url):

View File

@@ -628,8 +628,7 @@ class PornHubPagedPlaylistBaseIE(PornHubPlaylistBaseIE):
page_entries = self._extract_entries(webpage, host)
if not page_entries:
break
for e in page_entries:
yield e
yield from page_entries
if not self._has_more(webpage):
break

View File

@@ -7,6 +7,7 @@ from .common import InfoExtractor
from ..utils import (
ExtractorError,
clean_html,
join_nonempty,
time_seconds,
try_call,
unified_timestamp,
@@ -167,7 +168,7 @@ class RadikoBaseIE(InfoExtractor):
class RadikoIE(RadikoBaseIE):
_VALID_URL = r'https?://(?:www\.)?radiko\.jp/#!/ts/(?P<station>[A-Z0-9-]+)/(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?radiko\.jp/#!/ts/(?P<station>[A-Z0-9-]+)/(?P<timestring>\d+)'
_TESTS = [{
# QRR (文化放送) station provides <desc>
@@ -183,8 +184,9 @@ class RadikoIE(RadikoBaseIE):
}]
def _real_extract(self, url):
station, video_id = self._match_valid_url(url).groups()
vid_int = unified_timestamp(video_id, False)
station, timestring = self._match_valid_url(url).group('station', 'timestring')
video_id = join_nonempty(station, timestring)
vid_int = unified_timestamp(timestring, False)
prog, station_program, ft, radio_begin, radio_end = self._find_program(video_id, station, vid_int)
auth_token, area_id = self._auth_client()
@@ -207,7 +209,7 @@ class RadikoIE(RadikoBaseIE):
'ft': radio_begin,
'end_at': radio_end,
'to': radio_end,
'seek': video_id,
'seek': timestring,
},
),
}

View File

@@ -16,7 +16,7 @@ from ..utils import (
class RadioFranceIE(InfoExtractor):
_VALID_URL = r'^https?://maison\.radiofrance\.fr/radiovisions/(?P<id>[^?#]+)'
_VALID_URL = r'https?://maison\.radiofrance\.fr/radiovisions/(?P<id>[^?#]+)'
IE_NAME = 'radiofrance'
_TEST = {

View File

@@ -6,7 +6,7 @@ from ..utils import (
class ReverbNationIE(InfoExtractor):
_VALID_URL = r'^https?://(?:www\.)?reverbnation\.com/.*?/song/(?P<id>\d+).*?$'
_VALID_URL = r'https?://(?:www\.)?reverbnation\.com/.*?/song/(?P<id>\d+).*?$'
_TESTS = [{
'url': 'http://www.reverbnation.com/alkilados/song/16965047-mona-lisa',
'md5': 'c0aaf339bcee189495fdf5a8c8ba8645',

View File

@@ -8,7 +8,7 @@ from ..utils import js_to_json
class RTPIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?rtp\.pt/play/p(?P<program_id>[0-9]+)/(?P<id>[^/?#]+)/?'
_VALID_URL = r'https?://(?:www\.)?rtp\.pt/play/(?:(?:estudoemcasa|palco|zigzag)/)?p(?P<program_id>[0-9]+)/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'http://www.rtp.pt/play/p405/e174042/paixoes-cruzadas',
'md5': 'e736ce0c665e459ddb818546220b4ef8',
@@ -19,9 +19,25 @@ class RTPIE(InfoExtractor):
'description': 'As paixões musicais de António Cartaxo e António Macedo',
'thumbnail': r're:^https?://.*\.jpg',
},
}, {
'url': 'https://www.rtp.pt/play/zigzag/p13166/e757904/25-curiosidades-25-de-abril',
'md5': '9a81ed53f2b2197cfa7ed455b12f8ade',
'info_dict': {
'id': 'e757904',
'ext': 'mp4',
'title': '25 Curiosidades, 25 de Abril',
'description': 'Estudar ou não estudar - Em cada um dos episódios descobrimos uma curiosidade acerca de como era viver em Portugal antes da revolução do 25 de abr',
'thumbnail': r're:^https?://.*\.jpg',
},
}, {
'url': 'http://www.rtp.pt/play/p831/a-quimica-das-coisas',
'only_matching': True,
}, {
'url': 'https://www.rtp.pt/play/estudoemcasa/p7776/portugues-1-ano',
'only_matching': True,
}, {
'url': 'https://www.rtp.pt/play/palco/p13785/l7nnon',
'only_matching': True,
}]
_RX_OBFUSCATION = re.compile(r'''(?xs)
@@ -49,17 +65,17 @@ class RTPIE(InfoExtractor):
f, config = self._search_regex(
r'''(?sx)
var\s+f\s*=\s*(?P<f>".*?"|{[^;]+?});\s*
(?:var\s+f\s*=\s*(?P<f>".*?"|{[^;]+?});\s*)?
var\s+player1\s+=\s+new\s+RTPPlayer\s*\((?P<config>{(?:(?!\*/).)+?})\);(?!\s*\*/)
''', webpage,
'player config', group=('f', 'config'))
f = self._parse_json(
f, video_id,
lambda data: self.__unobfuscate(data, video_id=video_id))
config = self._parse_json(
config, video_id,
lambda data: self.__unobfuscate(data, video_id=video_id))
f = config['file'] if not f else self._parse_json(
f, video_id,
lambda data: self.__unobfuscate(data, video_id=video_id))
formats = []
if isinstance(f, dict):

View File

@@ -8,14 +8,17 @@ from ..utils import (
UnsupportedError,
clean_html,
determine_ext,
extract_attributes,
format_field,
get_element_by_class,
get_elements_html_by_class,
int_or_none,
join_nonempty,
parse_count,
parse_iso8601,
traverse_obj,
unescapeHTML,
urljoin,
)
@@ -382,8 +385,10 @@ class RumbleChannelIE(InfoExtractor):
if isinstance(e.cause, HTTPError) and e.cause.status == 404:
break
raise
for video_url in re.findall(r'class="[^>"]*videostream__link[^>]+href="([^"]+\.html)"', webpage):
yield self.url_result('https://rumble.com' + video_url)
for video_url in traverse_obj(
get_elements_html_by_class('videostream__link', webpage), (..., {extract_attributes}, 'href'),
):
yield self.url_result(urljoin('https://rumble.com', video_url))
def _real_extract(self, url):
url, playlist_id = self._match_valid_url(url).groups()

View File

@@ -6,6 +6,7 @@ from ..utils import (
determine_ext,
int_or_none,
parse_qs,
traverse_obj,
try_get,
unified_timestamp,
url_or_none,
@@ -80,6 +81,8 @@ class RutubeBaseIE(InfoExtractor):
'url': format_url,
'format_id': format_id,
})
for hls_url in traverse_obj(options, ('live_streams', 'hls', ..., 'url', {url_or_none})):
formats.extend(self._extract_m3u8_formats(hls_url, video_id, ext='mp4', fatal=False))
return formats
def _download_and_extract_formats(self, video_id, query=None):
@@ -90,7 +93,7 @@ class RutubeBaseIE(InfoExtractor):
class RutubeIE(RutubeBaseIE):
IE_NAME = 'rutube'
IE_DESC = 'Rutube videos'
_VALID_URL = r'https?://rutube\.ru/(?:video(?:/private)?|(?:play/)?embed)/(?P<id>[\da-z]{32})'
_VALID_URL = r'https?://rutube\.ru/(?:(?:live/)?video(?:/private)?|(?:play/)?embed)/(?P<id>[\da-z]{32})'
_EMBED_REGEX = [r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//rutube\.ru/(?:play/)?embed/[\da-z]{32}.*?)\1']
_TESTS = [{
@@ -164,6 +167,29 @@ class RutubeIE(RutubeBaseIE):
'uploader': 'Стас Быков',
},
'expected_warnings': ['Unable to download f4m'],
}, {
'url': 'https://rutube.ru/live/video/c58f502c7bb34a8fcdd976b221fca292/',
'info_dict': {
'id': 'c58f502c7bb34a8fcdd976b221fca292',
'ext': 'mp4',
'categories': ['Телепередачи'],
'description': '',
'thumbnail': 'http://pic.rutubelist.ru/video/14/19/14190807c0c48b40361aca93ad0867c7.jpg',
'live_status': 'is_live',
'age_limit': 0,
'uploader_id': '23460655',
'timestamp': 1652972968,
'view_count': int,
'upload_date': '20220519',
'title': r're:Первый канал. Прямой эфир \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
'uploader': 'Первый канал',
},
}, {
'url': 'https://rutube.ru/video/5ab908fccfac5bb43ef2b1e4182256b0/',
'only_matching': True,
}, {
'url': 'https://rutube.ru/live/video/private/c58f502c7bb34a8fcdd976b221fca292/',
'only_matching': True,
}]
@classmethod

Some files were not shown because too many files have changed in this diff Show More