Compare commits

...

281 Commits

Author SHA1 Message Date
github-actions
dee1d65dc3 [version] update
Created by: pukkandan

:ci skip all
2022-04-08 09:57:06 +00:00
pukkandan
7884ade65e Release 2022.04.08 2022-04-08 15:21:27 +05:30
Jacek Nowacki
89fabf1125 [bilibili] Fix extraction of title with quotes (#3350)
Closes #3289
Authored by: dzek69
2022-04-08 02:21:37 -07:00
pukkandan
11e1c2e3f8 [TikTokVM] Fix redirect to user URL
Closes #3349, Closes #3351
2022-04-08 14:46:45 +05:30
pukkandan
ebc7d3ff1f [docs] Minor improvements (#3309, #3343)
Authored by: cffswb, danielyli


Co-authored-by: Daniel Li <dan@danielyli.com>
Co-authored-by: cffswb <karte577@gmail.com>
2022-04-08 14:09:10 +05:30
pukkandan
d8a58ddce7 De-prioritize automatic-subtitles when no --sub-lang is given
Closes #3314
2022-04-08 14:01:23 +05:30
mehq
4d57133095 [Jable] Add extractor (#3341)
Closes #3284
Authored by: mehq
2022-04-07 23:49:14 -07:00
Alexander Seiler
9b8b7a7b5e [Zattoo] Fix extractors (#2288)
Closes: #1244
Authored by: goggle
2022-04-07 23:44:58 -07:00
Ha Tien Loi
ab0970b233 [NRK] Extract timestamp (#3231)
Closes #3211
Authored by: hatienl0i261299
2022-04-07 08:52:27 -07:00
Lesmiscore
b52e788eb2 [Piapro] Extract description with break lines
Authored by: Lesmiscore
Closes #3334
2022-04-07 20:21:42 +09:00
pukkandan
316f2650f8 Ignore mhtml formats from -f mergeall
Closes #3324
2022-04-07 16:42:14 +05:30
Ha Tien Loi
bd4073c535 [AfreecaTV] Add AfreecaTVUserIE (#3286)
Closes #3257
Authored by: hatienl0i261299
2022-04-07 04:03:13 -07:00
pukkandan
22fba53fbd [FfmpegMetadata] Write id3v1 tags 2022-04-07 15:51:23 +05:30
coletdev
61d3665d9d [youtube] Fix uploader for collaborative playlists (#3332)
Authored by: coletdjnz
2022-04-07 01:11:16 -07:00
Lesmiscore (Naoya Ozaki)
870efdee28 [TVer] Fix extractor (#3268)
Authored by: Lesmiscore
2022-04-07 16:19:36 +09:00
pukkandan
b506289fe2 [test] Add test_locked_file 2022-04-07 12:05:44 +05:30
pukkandan
b63837bce0 [utils] locked_file: Fix non-blocking non-exclusive lock 2022-04-07 12:02:13 +05:30
Justin Keogh
fcfa8853e4 [utils] locked_file: Do not truncate files before locking (#2994)
Authored by: jakeogh, pukkandan
2022-04-06 22:58:56 -07:00
Lesmiscore (Naoya Ozaki)
06b1628d3e [twitcasting] Don't return multi_video for archive with single hls manifest (#3319)
Authored by: Lesmiscore
2022-04-07 13:42:01 +09:00
panatexxa
da1ffde15d [Moviepilot] Add extractor (#3282)
Authored by: panatexxa
2022-04-06 19:26:12 -07:00
Ha Tien Loi
42a4f21a03 [fptplay] Fix metadata extraction (#3218)
Authored by: hatienl0i261299
2022-04-06 01:52:08 -07:00
pukkandan
8973767198 Do not lock downloading file on Windows
Closes #3124
2022-04-05 23:32:22 +05:30
pukkandan
0edb3e336c Do not prevent download if locking is unsupported
Closes #3022

Failure to lock download-archive is still fatal.
This is consistent with youtube-dl's behavior
2022-04-05 23:32:22 +05:30
pukkandan
ce0593ef61 [http] Fix #3215 2022-04-05 21:31:59 +05:30
pukkandan
a44ca5a470 [cleanup] Misc fixes
Closes https://github.com/yt-dlp/yt-dlp/pull/3213, Closes https://github.com/yt-dlp/yt-dlp/pull/3117

Related: https://github.com/yt-dlp/yt-dlp/issues/3146#issuecomment-1077323114, https://github.com/yt-dlp/yt-dlp/pull/3277#discussion_r841019671, a825ffbffa (commitcomment-68538986), https://github.com/yt-dlp/yt-dlp/issues/2360, 5fa3c9a88f (r70393519), 5fa3c9a88f (r70393254)
2022-04-05 18:12:18 +05:30
Teemu Ikonen
0a8a7e68fa [ruutu] Detect embeds (#3294)
Authored by: tpikonen
2022-04-05 05:15:47 -07:00
Jeff Huffman
f4d706a931 [crunchyroll:playlist] Implement beta API (#2955)
Closes #3121, #2930

Authored by: tejing1
2022-04-05 03:51:12 -07:00
Ha Tien Loi
5fa3c9a88f [TikTok] Fix URLs with user id (#3295)
Closes #3243
Authored by: hatienl0i261299
2022-04-04 03:07:07 -07:00
pukkandan
04f3fd2c89 [cleanup] Use _html_extract_title 2022-04-04 15:13:30 +05:30
pukkandan
85e801a9db Fallback to video-only format when selecting by extension
Closes #3296
2022-04-04 15:13:21 +05:30
pukkandan
5127e92a94 Fix filepath sanitization in --print-to-file 2022-04-04 12:59:44 +05:30
Ha Tien Loi
18eac302a2 [Imdb] Improve extractor (#3291)
Closes #3283
Authored by: hatienl0i261299
2022-04-04 00:29:35 -07:00
Tim Schindler
12e022d074 [Cybrary] Add extractor (#3264)
Authored by: aaearon
2022-04-04 00:20:14 -07:00
Lesmiscore (Naoya Ozaki)
265e586d96 [openrec] Download archived livestreams (#3267)
Authored by: Lesmiscore
2022-04-04 00:41:14 +09:00
Fam0r
fbfde1c3e6 [elonet] Rewrite extractor (#3277)
Closes #2911
Authored by: Fam0r, pukkandan
2022-04-03 08:11:50 -07:00
aarubui
dc57e74a7f [tenplay] Improve extractor (#3280)
Authored by: aarubui
2022-04-03 06:53:22 -07:00
pukkandan
a17526e427 [youtube:tab] Minor improvements (See desc)
* Support shorts on channel homepage
* Extract thumbnail of OLAK playlists
2022-04-03 19:01:03 +05:30
coletdev
ad210f4fd4 [youtube:search] Support hashtag entries (#3265)
Authored-by: coletdjnz
2022-04-02 06:11:14 +00:00
coletdjnz
c8e856a551 [web.archive:youtube] Make CDX API requests non-fatal
Partial fix for https://github.com/yt-dlp/yt-dlp/issues/3278
Authored-by: coletdjnz
2022-04-02 19:07:13 +13:00
nixxo
c085e4ec47 [rai] Fix extraction of http formats (#3272)
Closes #3270
Authored by: nixxo
2022-04-01 22:57:56 -07:00
pukkandan
4c268f9cb7 [Nebula] Fix bug in 52efa4b312 2022-04-02 11:22:17 +05:30
Lesmiscore (Naoya Ozaki)
5d45484cc7 [niconico] Fix extraction of thumbnails and uploader (#3266) 2022-04-01 19:31:58 +09:00
pukkandan
e6f868a63c [utils] traverse_obj: Allow filtering by value 2022-03-31 13:33:28 +05:30
pukkandan
c4f60dd7cd [utils] Add try_call 2022-03-31 13:33:27 +05:30
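The two `utils` additions above lend themselves to a short illustration. A minimal sketch, assuming this revision of yt-dlp is importable and that a callable in a `traverse_obj` path receives `(key, value)` and keeps only the branches where it returns true:

```python
from yt_dlp.utils import traverse_obj, try_call

data = {'formats': [{'url': None, 'tbr': 96},
                    {'url': 'https://v.example/hi.mp4', 'tbr': 512}]}

# Filter by value: keep only formats that actually have a URL,
# then collect their 'url' fields
urls = traverse_obj(data, ('formats', lambda _, v: v['url'], 'url'))
print(urls)  # ['https://v.example/hi.mp4']

# try_call returns None instead of raising when the callable fails
print(try_call(lambda: data['missing']['key']))  # None (KeyError swallowed)
```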
pukkandan
f189faf1ce [BRMediathek] Fix VALID_URL
Closes #2466
2022-03-31 13:33:17 +05:30
Alexander Seiler
504f789ad5 [AZMedien] Support tv.telezueri.ch (#3251)
Authored by: goggle
2022-03-30 20:23:32 -07:00
Bricio
bb5a7cb8ad [Craftsy] Add extractor (#3208)
Authored by: Bricio
2022-03-30 20:04:55 -07:00
zackmark29
c418e6b5a6 [viu] Fix bypass for preview (#3247)
Authored by: zackmark29
2022-03-30 19:47:58 -07:00
pukkandan
11078c6d57 [crunchyroll] Fix inheritance
https://github.com/yt-dlp/yt-dlp/pull/2955#issuecomment-1083060465
2022-03-30 18:19:51 +05:30
MrRawes
5d0aeac0e9 [docs] Clarify the exact BSD license of dependencies (#3197)
Authored by: MrRawes
2022-03-30 04:35:06 -07:00
Felix S
180c81509f [docs] Add an .editorconfig file (#3220)
Authored by: fstirlitz
2022-03-30 04:31:25 -07:00
Daniel
ab2579bb45 [xnxx] Add xnxx3.com (#3188)
Authored by: rozari0
2022-03-30 03:54:35 -07:00
Ha Tien Loi
48e15bb6b1 [dailymotion] Support geo.dailymotion.com (#3230)
Closes #3229
Authored by: hatienl0i261299
2022-03-30 03:04:00 -07:00
pukkandan
af4944d84b Fix bug in 8a7f68d0b1
Closes #3241
2022-03-30 12:22:36 +05:30
David
e7870111e8 [YouTube] Add new age-gate bypass (#3233)
Closes #3182
Authored by: zerodytrash, pukkandan
2022-03-29 03:05:31 -07:00
pukkandan
8a7f68d0b1 [ffmpeg] Cache version data
Related: https://github.com/dasl-/pifi/issues/9
2022-03-29 03:44:51 +05:30
Ha Tien Loi
9139d2fae0 [WasdTV] Add extractor (#3045)
Closes #3041
Authored by: un-def, hatienl0i261299
2022-03-27 20:27:41 -07:00
nyuszika7h
bdd60588b0 [viki] Don't attempt to modify URLs with signature (#3222)
Closes #1379
Authored by: nyuszika7h
2022-03-27 20:23:44 -07:00
Luc Ritchie
f5f15c9993 [BiliIntl] Support user-generated videos (#3203)
Authored by: wlritchi
2022-03-27 20:21:42 -07:00
pukkandan
cb96c5be70 Fix --no-overwrite for playlist infojson
Fixes: https://github.com/yt-dlp/yt-dlp/issues/1467#issuecomment-1079922971
2022-03-28 08:45:23 +05:30
pukkandan
90137ca4be [utils] Add filter_dict 2022-03-28 08:25:04 +05:30
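A hedged sketch of `filter_dict`, assuming its default predicate drops `None`-valued entries:

```python
from yt_dlp.utils import filter_dict

meta = {'title': 'clip', 'artist': None, 'album': None}
print(filter_dict(meta))  # {'title': 'clip'}
```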
coletdev
1c1b2f96ae [youtube:tab] Fix duration extraction for shorts (#3171)
Related: https://github.com/TeamNewPipe/NewPipe/issues/8034
Authored-by: coletdjnz
2022-03-28 00:49:42 +00:00
Felix S
47b8bf207b [go,viu] Extract subtitles from the m3u8 manifest (#3219)
Authored by: fstirlitz
2022-03-27 02:35:14 -07:00
Tim Schindler
4628a3aa75 [ITProTV] Add extractor (#3196)
Authored by: aaearon
2022-03-27 02:00:38 -07:00
mehq
5b4bb715e6 [BanBye] Add extractor (#3177)
Closes #3175
Authored by: mehq
2022-03-27 01:57:05 -07:00
pukkandan
1235d333ab [youtube] Fix auto-translated automatic captions
d49669acad only covered ASR

Closes #2956
2022-03-27 14:06:26 +05:30
pukkandan
18e4940825 [youtube] Add extractor-arg to skip auto-translated subs 2022-03-27 14:04:20 +05:30
pukkandan
c0b6e5c74d Show warning when all media formats have DRM
Related: #1379
2022-03-27 11:39:35 +05:30
shirt
727029c508 [youtube] Detect DRM better
Authored by: shirt-dev
2022-03-27 11:27:27 +05:30
pukkandan
5c3895fff1 [outtmpl] Limit changes during sanitization
Closes #2761
2022-03-27 11:18:35 +05:30
coletdev
fd2ad7cb24 [youtube:tab] Return shorts url if video is a short (#3168)
Allows filtering out shorts from feeds with `--match-filter`
Closes #3165
Authored-by: coletdjnz
2022-03-27 05:20:25 +00:00
pukkandan
4a3175fc4c [VideoConvertor] Ensure all streams are copied
Closes #3200
2022-03-27 09:28:58 +05:30
pukkandan
5cf34021f5 [Concat] Ensure final directory exists
Fixes https://github.com/yt-dlp/yt-dlp/issues/3181#issuecomment-1079622589
2022-03-27 04:52:11 +05:30
pukkandan
34baa9fdf0 [outtmpl] Fix replacement/default when used with alternate 2022-03-26 07:39:59 +05:30
pukkandan
6db9c4d57d Ignore format-specific fields in initial pass of --match-filter
Closes #3074
2022-03-25 14:27:09 +05:30
Lesmiscore (Naoya Ozaki)
3cea3edd1a [utils] WebSocketsWrapper: Allow omitting __enter__ invocation (#3187)
Authored by: Lesmiscore
2022-03-25 17:24:39 +09:00
pukkandan
b1a7cd056a Treat multiple --match-filters as OR
Closes #3144
2022-03-25 13:33:46 +05:30
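Since multiple match filters now combine as OR, embedding code can pass several conditions at once. A sketch assuming `match_filter_func` accepts a list of filter strings after this change:

```python
from yt_dlp import YoutubeDL
from yt_dlp.utils import match_filter_func

ydl_opts = {
    # download when EITHER condition holds (filters are OR-ed)
    'match_filter': match_filter_func(['duration < 60', 'like_count > 1000']),
}
with YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])
```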
pukkandan
28787f16c6 [downloader] Fix invocation of HttpieFD
Closes #3154
2022-03-25 13:00:42 +05:30
zackmark29
1fb707badb [viu] Fixed extractor (#3136)
Closes #3133
Authored by: zackmark29, pukkandan
2022-03-24 20:23:54 -07:00
pukkandan
a3f2445e29 [postprocessor,cleanup] Create _download_json 2022-03-25 08:45:35 +05:30
pukkandan
ae72962643 [youtube] Try embedded client variants before agegate
agegate variants appear to be broken, but don't remove them for the time being
2022-03-25 05:00:41 +05:30
pukkandan
ae6a1b9585 [docs] Minor improvements
Closes #3127, Closes #3081, Closes #3177
2022-03-24 07:30:25 +05:30
pukkandan
231025c463 Fix bug in 52efa4b312
Closes #3173
2022-03-24 07:28:10 +05:30
pukkandan
700ccbe3f1 [extractor] Allow control characters inside json
Closes #3174
2022-03-24 07:28:07 +05:30
vvto33
12a64f2777 [TVer] Support landing page (#3075)
Authored by: vvto33
2022-03-23 18:11:13 -07:00
mehq
b8f2f8f6b3 [LastFM] Add extractors (#3141)
Closes #2967
Authored by: mehq
2022-03-23 11:35:42 -07:00
coletdev
af14914baa Remove Accept-Encoding header from std_headers (#3153)
This should be set by each downloader to what it supports.
Fixes https://github.com/yt-dlp/yt-dlp/issues/3142
Authored-by: coletdjnz
2022-03-23 07:47:02 +00:00
pukkandan
ea5ca8e7fc [ellentube] Extract subtitles from manifest
Fixes https://github.com/ytdl-org/youtube-dl/issues/30761
2022-03-23 12:36:49 +05:30
Lesmiscore (Naoya Ozaki)
c2d2ee40eb [generic] Extract subtitles from video.js (#3156)
Authored by: Lesmiscore
2022-03-22 23:28:53 -07:00
pukkandan
c70c418d33 Fix --abort-on-error for subtitles
Closes #3163
2022-03-23 08:53:16 +05:30
pukkandan
b9c7b1e9b4 [cleanup, vimeo] Fix tests 2022-03-23 08:26:48 +05:30
coletdev
d5820461e8 Use certificates from certifi if installed (#3115)
Fixes #3102 and most `CERTIFICATE_VERIFY_FAILED` issues

Authored by: coletdjnz
2022-03-22 16:26:55 -07:00
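The certifi fallback can be approximated outside yt-dlp as well. A minimal sketch of the idea (not the project's internal code): prefer certifi's CA bundle when the module is installed, otherwise use the system store:

```python
import ssl

try:
    import certifi
    ssl_context = ssl.create_default_context(cafile=certifi.where())
except ImportError:
    ssl_context = ssl.create_default_context()  # system CAs
```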
coletdev
8a23db9519 [wget] Fix proxy (#3152)
Upstream PR: https://github.com/ytdl-org/youtube-dl/pull/29343
Authored-by: kikuyan, coletdjnz
2022-03-22 14:24:27 -07:00
CplPwnies
1f1df1251e [adobepass] Fix Suddenlink MSO (#3148)
Authored by: CplPwnies
2022-03-22 14:09:38 -07:00
1-Byte
84842aee2b [azmedien] Add TVO Online to supported hosts (#3125)
Authored by: 1-Byte
2022-03-20 10:49:00 -07:00
Lesmiscore (Naoya Ozaki)
be4685ab7b [http] Reject broken range before request (#3079)
* And fix filesize estimate for byterange downloads

Closes #2001
Authored by: Lesmiscore, Jules-A, pukkandan
2022-03-18 18:15:01 -07:00
coletdev
e6552207da [panopto] Improve subtitle extraction and support slides (#3009)
Related: #1946, #2908
Authored-by: coletdjnz
2022-03-18 22:19:36 +00:00
coletdev
a2e77303e3 [downloader/http] Retry on more errors (#3065)
Closes #3056, #2071
Related: #3034, #2969
Authored-by: coletdjnz
2022-03-18 22:10:20 +00:00
foghawk
510809f1aa [nitter] Minor fixes and update instance list (#3099)
Authored by: foghawk
2022-03-18 14:08:38 -07:00
i6t
f4ad919298 [Veo] Fix extractor (#3101)
Authored by: i6t
2022-03-18 14:06:52 -07:00
s0u1h
eeb2a770f3 [utils] format_decimal_suffix: Fix for very large numbers (#3109)
Authored by: s0u1h
2022-03-18 14:03:09 -07:00
pukkandan
0c14d66ad9 Fix autonumber
Bug in 09b49e1f68
2022-03-19 02:29:02 +05:30
pukkandan
52efa4b312 [extractor] Add _perform_login function (#2943)
* Adds new functions `_initialize_pre_login` and `_perform_login` as part of the extractor API
* Adds `ie.supports_login` to the public API
2022-03-18 13:53:33 -07:00
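A hedged sketch of the new hook from the extractor side; the site, URL and form fields below are hypothetical:

```python
from yt_dlp.extractor.common import InfoExtractor
from yt_dlp.utils import urlencode_postdata


class SomeSiteIE(InfoExtractor):  # hypothetical extractor
    _NETRC_MACHINE = 'somesite'

    def _perform_login(self, username, password):
        # called by the base class when credentials are available,
        # replacing ad-hoc login code in _real_initialize
        self._download_webpage(
            'https://somesite.example/login', None, 'Logging in',
            data=urlencode_postdata({'user': username, 'pass': password}))
```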
Luc Ritchie
028f6437f1 [afreecatv] Match new vod url (#3097)
Authored by: wlritchi
2022-03-18 02:53:07 -07:00
Sipherdrakon
43c38abd1f [ParamountPlus,CBS] Change VALID_URL (#3098)
Closes #3096

Authored by: Sipherdrakon
2022-03-18 02:49:31 -07:00
pukkandan
e4b98809cf [youtube] Fix pagination of membership tab 2022-03-18 05:23:51 +05:30
pukkandan
16c620bc55 Handle float in --wait-for-video
Closes #3082
2022-03-18 03:25:47 +05:30
pukkandan
5a373d9768 [veo] Fix _VALID_URL
Closes #3095
2022-03-18 03:01:07 +05:30
Ha Tien Loi
7e6a187096 [Huya] Add extractor (#3035)
Closes #3033
Authored by: hatienl0i261299
2022-03-17 07:24:15 -07:00
Lesmiscore (Naoya Ozaki)
3f168f0e45 [RUTV] Fix format sorting (#3085)
Closes #3084
Authored by: Lesmiscore
2022-03-17 07:11:36 -07:00
Lesmiscore (Naoya Ozaki)
7bdcb4a40e [niconico] Rewrite NiconicoIE (#3018)
Closes https://github.com/yt-dlp/yt-dlp/issues/2636, partially fixes https://github.com/yt-dlp/yt-dlp/issues/367
Authored by: Lesmiscore
2022-03-17 05:22:14 -07:00
Soebb
497a6c5f57 [daftsex] Fix extractor (#2757)
Closes #2637

Authored by: Soebb
2022-03-16 17:44:21 -07:00
BohwaZ
4b3c5d1b81 [FranceCulture] Support playlists (#1872)
Authored by: bohwaz
2022-03-16 17:40:27 -07:00
Dorian Westacott
ec47c12f69 [ParamountPlusSeries] Support multiple pages (#3026)
Authored by: dodrian
2022-03-16 16:54:20 -07:00
pukkandan
25791435b7 [arte] Add format_note to m3u8 formats
Related: #3086
2022-03-17 02:00:47 +05:30
pukkandan
4e34889f1c [rumble] unescape title 2022-03-17 01:37:04 +05:30
pukkandan
a1b2d84360 [youtube] Avoid false positives when detecting damaged formats
Closes #3083
2022-03-16 19:46:29 +05:30
coletdjnz
5dbc77df26 [youtube:api] Prefer minified JSON response
Authored-by: coletdjnz
2022-03-16 09:29:15 +13:00
Lesmiscore (Naoya Ozaki)
d71fd41249 [fragment] Read downloaded fragments only when needed (#3069)
Authored by: Lesmiscore
2022-03-15 12:27:41 +09:00
shirt
d69e55c1d8 [cleanup] Remove readthedocs from README.md 2022-03-14 12:19:33 -04:00
shirt
9f2a6352ea [docs] Remove readthedocs 2022-03-14 16:17:01 +00:00
pukkandan
aeb21b98f1 [phantomjs] Fix bug in 8b7539d27c
Closes #3066
2022-03-14 16:19:23 +05:30
coletdev
b3edc8068e [downloader/mhtml] Fix fragments with absolute urls (#3044)
Authored-by: coletdjnz
2022-03-13 22:03:40 +00:00
coletdev
17322130a9 [youtube] Improve video upload date handling (#3029)
* Don't prefer UTC upload date for past live streams/premieres
* Improve regex (fixes a regression)

Authored-by: coletdjnz
2022-03-13 22:02:44 +00:00
pukkandan
5ca764c506 [FFmpegVideoConvertor] Add more formats to --remux-video 2022-03-13 22:26:03 +05:30
pukkandan
e880c92c65 Exit after --dump-user-agent
Bug in d1b5f70bc9

Closes #3055
2022-03-13 14:38:39 +05:30
coletdjnz
a825ffbffa [extractor] Support merging subtitles with data
Authored-by: coletdjnz
2022-03-12 11:22:28 +13:00
pukkandan
592b748582 [cleanup] Minor cleanup
Closes #3006
2022-03-11 19:40:15 +05:30
pukkandan
cf4f42cb97 Protect stdout from unexpected progress and console-title
Closes #3023
2022-03-11 19:29:45 +05:30
pukkandan
da1d734fbe Remove incorrect warning for --dateafter
Closes #3030
2022-03-11 19:29:44 +05:30
pukkandan
2b38f7b2bc [MetadataParser] Validate outtmpl early 2022-03-11 19:29:43 +05:30
pukkandan
76aa991374 Fix case of http_headers
Bug in 8b7539d27c

Fixes https://github.com/yt-dlp/yt-dlp/issues/1346#issuecomment-1064527765
2022-03-11 19:29:34 +05:30
Lesmiscore (Naoya Ozaki)
24e3d87431 [PokemonSoundLibrary] Add extractor (#3001)
Authored by: Lesmiscore
2022-03-10 22:24:50 +09:00
Ha Tien Loi
63b2f88bc7 [Zingmp3] Fix signature (#3004)
Authored by: hatienl0i261299
2022-03-09 22:13:19 -08:00
pukkandan
07ff290dce Fix --sleep-interval
Bug in d1b5f70bc9

Closes #3012
2022-03-10 11:38:34 +05:30
pukkandan
51c22ef4e2 Fix --throttled-rate
Typo in d1b5f70bc9

Closes #2996
2022-03-10 03:29:01 +05:30
Ha Tien Loi
33b8c411bc [MangoTV] Improve extractor (#2971)
Authored by: hatienl0i261299
2022-03-09 13:54:26 -08:00
MMM
10331a2672 Fix --print with --ignore-no-formats when url is None (#3000)
Authored by: flashdagger
2022-03-09 13:12:23 -08:00
Lesmiscore (Naoya Ozaki)
6e6beffd04 [openrec] Refactor extractors (#2941)
Authored by: Lesmiscore
2022-03-09 21:08:09 +09:00
pukkandan
e491d06d34 [utils] ExtractorError: Fix for older python versions
Closes #2993
2022-03-09 06:42:25 +05:30
pukkandan
7a0ba75857 [build] Add requirements.txt to pip distributions
Closes #2995
2022-03-09 06:42:24 +05:30
coletdev
e248be3319 [panopto] Add extractors (#2908)
Based on https://github.com/ytdl-org/youtube-dl/pull/13449
Closes #1946
Authored by: coletdjnz, kmark
2022-03-08 13:00:57 -08:00
pukkandan
ff91cf7483 [utils] Add get_first 2022-03-09 02:26:52 +05:30
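A hedged sketch of `get_first`, assumed to be a thin wrapper over `traverse_obj` that scans a sequence of dicts and returns the first match for a key path:

```python
from yt_dlp.utils import get_first

player_responses = [{}, {'videoDetails': {'videoId': 'BaW_jenozKc'}}]
print(get_first(player_responses, ('videoDetails', 'videoId')))  # BaW_jenozKc
```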
github-actions
a3b7dff015 [version] update
Created by: pukkandan

:ci skip all
2022-03-08 20:23:28 +00:00
pukkandan
c0c2c57d35 Release 2022.03.08.1 2022-03-09 01:52:16 +05:30
pukkandan
aee6ce5867 [build] Fix bug in 08d30158ec 2022-03-09 01:39:47 +05:30
pukkandan
d1b5f70bc9 [cleanup] Refactor __init__.py (#2570)
* Split `__init__` code into multiple functions
* Clean up validation code by grouping similar types of options
* Expose `parse_options` to third parties
2022-03-08 12:03:31 -08:00
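With `parse_options` exposed, embedders can reuse yt-dlp's own CLI parsing. A sketch assuming the function returns `(parser, opts, urls, ydl_opts)` as described in the PR summary:

```python
from yt_dlp import YoutubeDL, parse_options

parser, opts, urls, ydl_opts = parse_options(['-f', 'bestaudio', '--no-playlist'])
with YoutubeDL(ydl_opts) as ydl:
    ydl.download(urls or ['https://www.youtube.com/watch?v=BaW_jenozKc'])
```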
github-actions
1eae7f94c1 [version] update
Created by: pukkandan

:ci skip all
2022-03-09 01:31:39 +05:30
pukkandan
535eb16a44 Release 2022.03.08 2022-03-09 01:24:27 +05:30
P-reducible
9461cb586a [Rokfin] Fix availability (#1534)
Authored by: P-reducible
2022-03-08 11:42:00 -08:00
pukkandan
a405b38f20 [youtube] Further de-prioritize 3gp format 2022-03-08 23:02:38 +05:30
pukkandan
08d30158ec [cleanup, docs] Misc cleanup
Closes #2828, closes #2734, closes #2802, closes #2937
2022-03-08 22:38:06 +05:30
Ha Tien Loi
c89bec262c [xinpianchang] Add extractor (#2963)
Authored by: hatienl0i261299
2022-03-08 08:55:40 -08:00
Ha Tien Loi
151f8f1c02 [fptplay] Add extractor (#2949)
Closes #2857
Authored by: hatienl0i261299
2022-03-08 08:52:51 -08:00
Max Mehl
a35155be17 [peertube] Add media.fsfe.org (#2986)
Authored by: mxmehl
2022-03-08 08:48:35 -08:00
nyuszika7h
e66662b1e0 [ccma] Fix timestamp parsing (#2989)
Authored by: nyuszika7h
2022-03-08 08:45:23 -08:00
coletdev
4390d5ec12 Add brotli content-encoding support (#2433)
Authored by: coletdjnz
2022-03-08 08:44:05 -08:00
CplPwnies
9e0e6adb2d [adobepass] Add Suddenlink MSO (#2977)
Closes #2704
Authored by: CplPwnies
2022-03-08 08:18:52 -08:00
Lesmiscore
b637c4e22e [mildom] Fix linter 2022-03-08 23:56:30 +09:00
Lesmiscore (Naoya Ozaki)
fb6e3f4389 [mildom] Rework extractors (#2940)
Authored by: Lesmiscore
2022-03-08 23:49:10 +09:00
pukkandan
409cdd1ec9 [ard] Fix valid URL
Partial fix for #2975
2022-03-08 12:58:26 +05:30
coletdev
992f9a730b [youtube] Prefer UTC upload date for videos (#2223)
Except for live/scheduled streams/premieres. 
Closes #1881
Related: #2402 
Authored-by: coletdjnz
2022-03-08 12:58:19 +05:30
pukkandan
497d2fab6c [utils] Better traceback for ExtractorError 2022-03-08 12:04:49 +05:30
pukkandan
2807d1709b [nrk] Add fallback API
Closes #1891
2022-03-08 11:13:25 +05:30
shirt
b46ccbc6d4 [build] Update pyinstaller to 4.10 2022-03-07 23:02:27 -05:00
Lesmiscore
1ed7953a74 [utils] render_table: Fix character calculation for removing extra gap
without this fix, the column next to the delimiter lacks leading spaces in terminal output (see https://github.com/yt-dlp/yt-dlp/pull/920#issuecomment-1059914615 for the situation)
2022-03-06 17:11:10 +09:00
pukkandan
d49669acad [youtube] Fix automatic captions
Closes #2956
2022-03-05 09:42:12 +05:30
foghawk
bed30106f5 [tumblr] Fix extractor (#2883)
Authored by: foghawk
2022-03-04 19:24:49 -08:00
Zenon Mousmoulas
27231526ae [ant1newsgr] Add extractor (#1982)
Authored by: zmousm
2022-03-04 13:52:48 -08:00
pukkandan
50e93e03a7 Update to ytdl-commit-6508688
Make default upload_/release_date a compat_str
6508688e88

Except:
* "[NDR] Overhaul NDR and NJoy extractors" https://github.com/ytdl-org/youtube-dl/pull/30531
    - 01824d275b
    - 39a98b09a2
    - f0a05a55c2
    - 4186e81777
2022-03-05 02:24:17 +05:30
FestplattenSchnitzel
72e995f122 [VideocampusSachsen] Add extractors (#2787)
Authored by: FestplattenSchnitzel
2022-03-04 08:19:07 -08:00
pukkandan
8b7539d27c Implement --add-header without modifying std_headers
Closes #2526, #1614
2022-03-04 20:59:03 +05:30
pukkandan
e48b3875ec Revert 2e4cacd038
Closes #2923
2022-03-04 20:18:14 +05:30
pukkandan
2a938746f3 Fix verbose log when stdout/stderr encoding is None
See: 5c10453827
2022-03-04 19:49:39 +05:30
pukkandan
933dbf5a55 [bandcamp] Detect acodec 2022-03-04 19:49:38 +05:30
pukkandan
a10aa588b0 [FormatSort] Consider acodec=ogg as vorbis 2022-03-04 19:49:38 +05:30
pukkandan
be8cd3cb1d [twitch] Fix field name of view_count 2022-03-04 19:49:37 +05:30
pukkandan
319b6059d2 Better error message when no --live-from-start format 2022-03-04 19:49:36 +05:30
pukkandan
4c3f8c3fb6 Handle negative duration from extractor
Closes #2921
2022-03-04 19:49:36 +05:30
pukkandan
7265a2190c Fix doubling of video_id in ExtractorError 2022-03-04 19:37:43 +05:30
pukkandan
3a4bb9f751 [lbry] Fix --ignore-no-formats-error
Closes #2942
2022-03-04 19:24:52 +05:30
i6t
b90dbe6c19 [Gettr] Improve extractor (#2920)
Authored by: i6t
2022-03-04 05:53:43 -08:00
Jeff Huffman
97bef011ee [crunchyroll] Better error reporting on login failure (#2938)
Authored by: tejing1
2022-03-04 03:27:35 -08:00
Ha Tien Loi
ecca4519b7 [zingmp3] Fix extractor (#2889)
Authored by: hatienl0i261299
2022-03-04 03:22:45 -08:00
Ha Tien Loi
761fba6d22 [daystar] Add extractor (#2924)
Closes #2887
Authored by: hatienl0i261299
2022-03-04 03:19:57 -08:00
Ha Tien Loi
5bcccbfec3 [telegram] Add extractor (#2922)
Closes #2910

Authored by: hatienl0i261299
2022-03-04 03:18:46 -08:00
coletdev
ded9f32667 [extractor] Support --mark-watched without _NETRC_MACHINE (#2939)
Authored by: coletdjnz
2022-03-03 23:27:09 -08:00
Emanuel Hoogeveen
45806d44a7 [downloader] Obey --file-access-retries when deleting/renaming (#2224)
Authored by: ehoogeveen-medweb
2022-03-03 06:33:32 -08:00
pukkandan
747c0bd127 [utils] Improve file locking
* Implement non-blocking locks for windows
* Don't raise error when closing a closed file
2022-03-03 19:28:47 +05:30
Justin Keogh
acea8d7cfb [utils] Fix file locking for AOSP (#2714)
Closes #2080, #2670

Authored by: jakeogh
2022-03-03 05:09:00 -08:00
pukkandan
f1d130902b [utils] OnDemandPagedList: Do not download pages after error 2022-03-03 02:43:16 +05:30
pukkandan
c2ae48dbd5 [manyvids] Extract uploader (#2913)
Authored by: regarten
2022-03-03 01:21:05 +05:30
pukkandan
a5c0c20252 [cleanup] Don't pass protocol to _extract_m3u8_formats for live videos
`live` parameter already handles changing the protocol
2022-03-02 22:59:48 +05:30
Lesmiscore (Naoya Ozaki)
f494ddada8 [niconico] Add playlist extractors and refactor (#2915)
Authored by: Lesmiscore
2022-03-03 02:08:13 +09:00
Lesmiscore (Naoya Ozaki)
02fc6feb6e [mirrativ] Cleanup extractor code (#2925)
Authored by: Lesmiscore
2022-03-03 02:06:34 +09:00
pukkandan
7eaf7f9aba [rokfin] Add stack and channel extractors (#1534)
Authored by: P-reducible, pukkandan
2022-03-02 21:39:08 +05:30
pukkandan
334b1c4800 [rokfin] Add extractor (#1534)
Authored by: P-reducible, pukkandan
2022-03-02 19:27:34 +05:30
coletdev
7c219ea601 [youtube:tab] Follow redirect to regional channel (#2884)
Closes #2694
Authored by: coletdjnz
2022-02-28 21:08:19 -08:00
Lesmiscore (Naoya Ozaki)
93c8410d33 [downloader/fragment] Fix bugs around resuming with Range (#2901)
Authored by: Lesmiscore
2022-02-28 13:10:54 +09:00
Lesmiscore
195c22840c [downloader/fragment] Ignore FileNotFoundError when downloading livestreams
when `--live-from-start` is used for YouTube and the live ends, the request for the last segment ends prematurely (or returns 404/403),
leaving the file missing and causing `FileNotFoundError`.
The missing segment contains no data, so it is safe to ignore
2022-02-26 12:34:36 +09:00
Lesmiscore
f0734e1190 [downloader/fragment] Fix problem where multiple threads can share one iterator
which causes "ValueError: generator already executing" error

Closes #2881
2022-02-25 13:22:17 +09:00
Lesmiscore (Naoya Ozaki)
15dfb3929c [fc2:live] Add extractor (#2418)
Authored by: Lesmiscore
2022-02-25 11:16:23 +09:00
Lesmiscore (Naoya Ozaki)
3e9b66d761 [AbemaTV] Add extractors (#1688)
Authored by: Lesmiscore
2022-02-25 11:14:04 +09:00
Lesmiscore (Naoya Ozaki)
a539f06570 [downloader/fragment] Improve --live-from-start for YouTube livestreams (#2870) 2022-02-25 02:00:46 +09:00
pukkandan
b440e1bb22 [devscripts] Improve prepare_manpage
Closes #2873
2022-02-24 17:02:52 +05:30
Lesmiscore (Naoya Ozaki)
03f830040a [YoutubeDL] Fill more fields for playlists (#2824) 2022-02-24 18:42:53 +09:00
pukkandan
09b49e1f68 Add pre-processor stage after_filter
* Move `_match_entry` and `post_extract` to `process_video_result`. It is also left in `process_info` for API compat
* `--list-...` options and `--force-write-archive` now obey filtering options
* Move `SponsorBlockPP` to `after_filter`. Closes https://github.com/yt-dlp/yt-dlp/issues/2536
* Reverts 4ec82a72bb since this commit addresses the issue it was solving
2022-02-23 04:26:48 +05:30
pukkandan
1108613f02 [youtube:tab] Reject webpage data if redirected to home page
Closes #2660
2022-02-23 04:25:55 +05:30
pukkandan
a30a6ed3e4 [youtube:tab] Add approximate_date extractor-arg 2022-02-23 04:25:55 +05:30
pukkandan
65d151d58f [spiegel] Fix _VALID_URL
Closes #2842
2022-02-23 04:25:48 +05:30
pukkandan
72073451be [ThumbnailsConvertor] Support webp
Closes #2226
2022-02-23 03:51:13 +05:30
Lesmiscore (Naoya Ozaki)
77cc7c6e60 [nhk] Add support for NHK for School (#2850)
Authored by: Lesmiscore
2022-02-23 01:15:08 +09:00
i6t
971c4847d7 [Gettr] Fix formats order (#2832)
Closes #2557

Authored by: i6t
2022-02-22 06:24:36 -08:00
Nil Admirari
7a34b5d628 [SponsorBlock] Fixes for highlight and "full video labels" (#2849)
Authored by: nihil-admirari
2022-02-22 06:18:44 -08:00
Aniruddh Joshi
4d4f9a029f [zee5] Support web-series (#2827)
Authored by: Aniruddh-J
2022-02-21 00:07:36 -08:00
Lesmiscore (Naoya Ozaki)
f099df1463 [TwitCasting] Check for password protection (#2838)
Authored by: Lesmiscore
2022-02-20 20:48:26 +09:00
pukkandan
3f4faff748 [generic] Pass referer to extracted formats
Closes #2839
2022-02-20 17:14:31 +05:30
Daniel.Zeng
be8d623455 [Bilibili] Pass referer for all formats (#2834)
Authored by: blackgear
2022-02-20 03:27:02 -08:00
Lesmiscore
a7d4acc018 [youtube] Escape possible $ in _extract_n_function_name regex 2022-02-20 17:33:58 +09:00
Bepis
febff4c119 [tubitv] Fix/improve TV series extraction (#2829)
Authored by: bbepis
2022-02-19 04:00:51 -08:00
pukkandan
ed66a17ef0 [FFmpegConcat] Abort on --simulate 2022-02-18 23:17:37 +05:30
Bricio
5625e6073f [Biqle] Fix extractor (#2731)
Closes #193
Authored by: Bricio
2022-02-18 08:02:14 -08:00
pukkandan
0ad92dfb18 [youtube] De-prioritize potentially damaged formats
Closes #2823
2022-02-18 19:41:37 +05:30
pukkandan
60f3e99592 Tolerate failure to --write-link due to unknown URL
Closes #2724
2022-02-18 18:14:50 +05:30
pukkandan
8d93e69d67 Create necessary directories for --print-to-file
Closes #2721
2022-02-18 18:03:21 +05:30
pukkandan
3aa915400d Fix -all for --sub-langs
Closes #2703
2022-02-18 18:03:20 +05:30
pukkandan
dcd55f766d [aria2c] Add --http-accept-gzip=true
Closes #1936, #1236
2022-02-18 18:03:20 +05:30
pukkandan
2e4cacd038 [youtube] Fix intermittent failure of embed-based age-gate bypass 2022-02-18 18:03:13 +05:30
Ronnnny
c15c316b21 [abc] Support 1080p (#2819)
Authored by: Ronnnny
2022-02-18 00:25:47 -08:00
Bricio
549cb2a836 [rtvs] Fix extractor (#2795)
Closes #2758

Authored by: Bricio
2022-02-18 00:15:17 -08:00
MinePlayersPE
c571b3a6ab [youtube] Fix n-sig extraction for phone player JS (#2816)
Authored by: MinePlayersPE
2022-02-18 00:12:20 -08:00
Bricio
5b804e3906 [washingtonpost] Fix extractor (#2796)
Closes #2778
Authored by: Bricio
2022-02-17 09:38:58 -08:00
Lesmiscore (Naoya Ozaki)
6bb608d055 [piapro] Add extractor (#2801)
Based on https://github.com/ytdl-org/youtube-dl/pull/25922
Closes #2710, https://github.com/ytdl-org/youtube-dl/issues/5856

Authored by: pycabbage, Lesmiscore
2022-02-17 09:15:29 -08:00
Nil Admirari
ae419aa94f [Sponsorblock] minor fixes (#2793)
* preserve mtime - Closes #2769
* keep concat spec on failure

Authored by: nihil-admirari
2022-02-17 09:10:34 -08:00
ajj8
ac184ab742 [bbc] Fix extraction of news articles (#2811)
Closes #1374

Authored by: ajj8
2022-02-17 07:54:53 -08:00
pukkandan
5c10453827 Fix for when stdout/stderr encoding is None
Closes #2711
2022-02-17 19:21:59 +05:30
pukkandan
ffa89477ea [extractor] Fix for manifests without period duration
Closes #2705
Authored by: dirkf, pukkandan
2022-02-17 19:07:23 +05:30
zenerdi0de
db74de8c54 [dropbox] fix regex (#2814)
Closes #2812
Authored by: zenerdi0de
2022-02-17 04:20:47 -08:00
Grabien
edecb5f81f [extractor/cspan] Support of C-Span congress videos (#2295)
Authored by: Grabien
2022-02-16 11:21:05 -08:00
lyz-code
85a0ad0117 [bandcamp] Fix user URLs (#2800)
Authored by: lyz-code
2022-02-16 07:56:17 -08:00
Lesmiscore (Naoya Ozaki)
07ea0014ae [twitcasting] Add fallback for finding running live (#2803)
Authored by: Lesmiscore
2022-02-16 20:32:14 +09:00
schn0sch
e1f7f235bd [peekvids] Use JSON-LD (#2784)
Authored by: schn0sch
2022-02-16 01:32:24 -08:00
shirt
fc259cc249 [build] Update pyinstaller to 4.9 2022-02-15 17:48:02 -05:00
Lesmiscore (Naoya Ozaki)
9a5b012575 [niconico:tag] Add support for searching tags (#2789) 2022-02-16 02:12:39 +09:00
Lesmiscore (Naoya Ozaki)
df635a09a4 [twitcasting] Fix extraction (#2788)
Authored by: Lesmiscore
2022-02-15 23:30:11 +09:00
cyberfox1691
812283199a [murrtube] Add extractor (#2387)
Authored by: cyberfox1691
2022-02-15 03:10:16 -08:00
marieell
5c6dfc1f79 [ATVAt] Detect geo-restriction (#2777)
Authored by: marieell
2022-02-15 01:16:49 -08:00
schn0sch
c2a8547fdc [peekvids] Add extractor (#2414)
Authored by: schn0sch
2022-02-14 19:21:27 -08:00
Bricio
0a19532ead [Caltrans] Add extractor (#2781)
Closes #2775

Authored by: Bricio
2022-02-14 18:45:36 -08:00
Ronald Ip
2d41e2eceb [twitter] Fix for private videos (#2772)
Closes #2762, https://github.com/ytdl-org/youtube-dl/issues/27643
Authored by: iphoting
2022-02-14 08:37:21 -08:00
Lesmiscore (Naoya Ozaki)
81c5f44c0f [fc2] Fix extraction (#2776)
Closes #2774

Authored by: Lesmiscore
2022-02-15 01:35:20 +09:00
Michael Pauley
1f7db8533a [cookies] Update MacOS12 Cookies.binarycookies location (#2742)
Authored by: mdpauley
2022-02-14 06:36:51 -08:00
pukkandan
e8969bda94 Obey --abort-on-error for "ffmpeg not installed"
Closes #1840
2022-02-14 14:40:19 +05:30
chris
c82f051dbb [ruv.is] Add extractor (#2665)
Closes: #2122

Authored by: iw0nderhow
2022-02-13 13:40:50 -08:00
pukkandan
49895f062e [tiktok] Fix vt.tiktok URLs
and add test
2022-02-14 03:06:51 +05:30
coletdev
60f393e48b [youtube] Ensure subtitle urls are absolute (#2765)
Closes #2755

Authored by: coletdjnz
2022-02-13 13:36:01 -08:00
pukkandan
88afe05695 [tiktok] Fix vm.tiktok URLs
Closes #2396
2022-02-13 21:15:59 +05:30
pukkandan
57ebfca39b Set webpage_url_... from webpage_url and not input URL
Closes #2756
2022-02-13 21:15:50 +05:30
YuenSzeHong
b1cb0525ac [fujitv] Extract resolution for free sources (#2685)
Authored by: YuenSzeHong
2022-02-13 06:39:01 -08:00
Lesmiscore (Naoya Ozaki)
da42679b87 [utils] WebSockets wrapper for non-async functions (#2417)
Authored by: Lesmiscore
2022-02-13 14:58:21 +09:00
Lesmiscore
2944835080 [bigo] Fix extractor to not use form_params 2022-02-13 00:01:04 +09:00
Tom
a3eb987e0e [zoom] Add support for screen cast (#2699)
Authored by: Mipsters
2022-02-12 06:22:51 -08:00
Lesmiscore (Naoya Ozaki)
7bc33ad0e9 [bigo] Add extractor (#2749)
Fixes https://github.com/ytdl-org/youtube-dl/issues/18357

Authored by: Lesmiscore
2022-02-12 06:07:10 -08:00
Bricio
2068a60318 [generic] Set rss guid as video id (#2741)
Closes #2424
Authored by: Bricio
2022-02-11 15:32:58 -08:00
Lukas Fink
1ce9a3cb49 Add regex operator and quoting to format filters (#2698)
Closes #2681 
Authored by: lukasfink1
2022-02-11 13:35:34 -08:00
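A hedged sketch of the new operator: `~=` matches a string field against a Python regex, and filter values may now be quoted. Which fields are populated varies by extractor; `format_note` here is illustrative:

```python
from yt_dlp import YoutubeDL

ydl_opts = {
    # best video whose format_note matches the regex, plus best audio
    'format': "bv*[format_note~='(?i)premium']+ba/b",
}
with YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])
```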
pukkandan
d49f8db39f [utils] Validate DateRange input
Closes #2641
2022-02-12 02:46:05 +05:30
pukkandan
ab6df717d1 [youtube] Differentiate descriptive audio by language code
Related: #2677
2022-02-12 02:13:17 +05:30
pukkandan
0c8d9e5fec [youtube] Label original auto-subs
Closes #2655
2022-02-12 01:50:49 +05:30
Felix S
3f047fc406 [extractor] Extract subtitles from manifests for more sites (#2686)
vimeo, globo, kaltura, svt

Authored by: fstirlitz
2022-02-11 11:03:33 -08:00
i6t
82b5176783 [Gettr] Add GettrStreamingIE (#2661)
Closes #2654
Authored by: i6t
2022-02-11 10:36:16 -08:00
Bricio
17b183886f [globo] Expand valid URL (#2732)
Closes #2730 
Authored by: Bricio
2022-02-11 10:08:55 -08:00
Bricio
cd170e8184 [beeg] Fix extractor (#2616)
Closes #2592

Authored by: Bricio
2022-02-11 10:05:23 -08:00
pukkandan
297e9952b6 [extractor] Allow http_headers to be specified for thumbnails 2022-02-11 23:31:12 +05:30
marieell
dca4f46274 [cleanup] Remove extractors for some dead websites (#2739)
90tv.ir, HornBunny.com, 220.ro, 5min.com, Kankan.com, Roxwel.com,
FreshLive.tv, TheScene.com, Vube.com

Authored by: marieell
2022-02-11 09:46:29 -08:00
Luc Ritchie
5dee3ad037 [afreecatv] Support password-protected livestreams (#2738)
Authored by: wlritchi
2022-02-11 06:15:59 -08:00
pukkandan
079a7cfc71 [downloader] Do not use aria2c for non-native m3u8
Closes #2718
2022-02-11 12:09:03 +05:30
pukkandan
3856407a86 [options] Rename --clean-infojson to --clean-info-json 2022-02-11 12:07:10 +05:30
pukkandan
db2e129ca0 [options] Better ambiguous option resolution
Eg: `--write-auto` no longer results in
> ambiguous option: --write-auto (--write-auto-subs, --write-automatic-subs?)
2022-02-11 12:07:03 +05:30
marieell
1209b6ca5b [zaq1] Remove dead extractor (#2728)
Was already partially removed in 29f7c58aaf
Authored-by: marieell
2022-02-11 02:15:38 +00:00
Justin Keogh
a3125791c7 [utils] Use locked_file for sanitize_open (#1066)
Authored by: jakeogh
2022-02-05 16:15:51 +05:30
ofkz
f1657a98cb [nfb] Add extractor (#2579)
Authored by: ofkz
2022-02-05 02:52:30 +05:30
285 changed files with 10950 additions and 5088 deletions

.editorconfig (new file, +8)

@@ -0,0 +1,8 @@
root = true
[**.py]
charset = utf-8
indent_size = 4
indent_style = space
trim_trailing_whitespace = true
insert_final_newline = true

.gitattributes (+2)

@@ -2,3 +2,5 @@
Makefile* text whitespace=-tab-in-indent
*.sh text eol=lf
*.md diff=markdown
*.py diff=python


@@ -11,7 +11,7 @@ body:
options:
- label: I'm reporting a broken site
required: true
- label: I've verified that I'm running yt-dlp version **2022.02.04**. ([update instructions](https://github.com/yt-dlp/yt-dlp#update))
- label: I've verified that I'm running yt-dlp version **2022.04.08** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are alive and playable in a browser
required: true
@@ -51,12 +51,12 @@ body:
[debug] Portable config file: yt-dlp.conf
[debug] Portable config: ['-i']
[debug] Encodings: locale cp1252, fs utf-8, stdout utf-8, stderr utf-8, pref cp1252
[debug] yt-dlp version 2022.02.04 (exe)
[debug] yt-dlp version 2022.04.08 (exe)
[debug] Python version 3.8.8 (CPython 64bit) - Windows-10-10.0.19041-SP0
[debug] exe versions: ffmpeg 3.0.1, ffprobe 3.0.1
[debug] Optional libraries: Cryptodome, keyring, mutagen, sqlite, websockets
[debug] Proxy map: {}
yt-dlp is up to date (2022.02.04)
yt-dlp is up to date (2022.04.08)
<more lines>
render: shell
validations:


@@ -11,7 +11,7 @@ body:
options:
- label: I'm reporting a new site support request
required: true
- label: I've verified that I'm running yt-dlp version **2022.02.04**. ([update instructions](https://github.com/yt-dlp/yt-dlp#update))
- label: I've verified that I'm running yt-dlp version **2022.04.08** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are alive and playable in a browser
required: true
@@ -62,12 +62,12 @@ body:
[debug] Portable config file: yt-dlp.conf
[debug] Portable config: ['-i']
[debug] Encodings: locale cp1252, fs utf-8, stdout utf-8, stderr utf-8, pref cp1252
[debug] yt-dlp version 2022.02.04 (exe)
[debug] yt-dlp version 2022.04.08 (exe)
[debug] Python version 3.8.8 (CPython 64bit) - Windows-10-10.0.19041-SP0
[debug] exe versions: ffmpeg 3.0.1, ffprobe 3.0.1
[debug] Optional libraries: Cryptodome, keyring, mutagen, sqlite, websockets
[debug] Proxy map: {}
yt-dlp is up to date (2022.02.04)
yt-dlp is up to date (2022.04.08)
<more lines>
render: shell
validations:


@@ -11,7 +11,7 @@ body:
options:
- label: I'm reporting a site feature request
required: true
- label: I've verified that I'm running yt-dlp version **2022.02.04**. ([update instructions](https://github.com/yt-dlp/yt-dlp#update))
- label: I've verified that I'm running yt-dlp version **2022.04.08** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are alive and playable in a browser
required: true
@@ -60,12 +60,12 @@ body:
[debug] Portable config file: yt-dlp.conf
[debug] Portable config: ['-i']
[debug] Encodings: locale cp1252, fs utf-8, stdout utf-8, stderr utf-8, pref cp1252
[debug] yt-dlp version 2022.02.04 (exe)
[debug] yt-dlp version 2022.04.08 (exe)
[debug] Python version 3.8.8 (CPython 64bit) - Windows-10-10.0.19041-SP0
[debug] exe versions: ffmpeg 3.0.1, ffprobe 3.0.1
[debug] Optional libraries: Cryptodome, keyring, mutagen, sqlite, websockets
[debug] Proxy map: {}
yt-dlp is up to date (2022.02.04)
yt-dlp is up to date (2022.04.08)
<more lines>
render: shell
validations:


@@ -11,7 +11,7 @@ body:
options:
- label: I'm reporting a bug unrelated to a specific site
required: true
- label: I've verified that I'm running yt-dlp version **2022.02.04**. ([update instructions](https://github.com/yt-dlp/yt-dlp#update))
- label: I've verified that I'm running yt-dlp version **2022.04.08** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are alive and playable in a browser
required: true
@@ -45,12 +45,12 @@ body:
[debug] Portable config file: yt-dlp.conf
[debug] Portable config: ['-i']
[debug] Encodings: locale cp1252, fs utf-8, stdout utf-8, stderr utf-8, pref cp1252
[debug] yt-dlp version 2022.02.04 (exe)
[debug] yt-dlp version 2022.04.08 (exe)
[debug] Python version 3.8.8 (CPython 64bit) - Windows-10-10.0.19041-SP0
[debug] exe versions: ffmpeg 3.0.1, ffprobe 3.0.1
[debug] Optional libraries: Cryptodome, keyring, mutagen, sqlite, websockets
[debug] Proxy map: {}
yt-dlp is up to date (2022.02.04)
yt-dlp is up to date (2022.04.08)
<more lines>
render: shell
validations:


@@ -13,7 +13,7 @@ body:
required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true
- label: I've verified that I'm running yt-dlp version **2022.02.04**. ([update instructions](https://github.com/yt-dlp/yt-dlp#update))
- label: I've verified that I'm running yt-dlp version **2022.04.08** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues including closed ones. DO NOT post duplicates
required: true
@@ -30,3 +30,24 @@ body:
placeholder: WRITE DESCRIPTION HERE
validations:
required: true
- type: textarea
id: log
attributes:
label: Verbose log
description: |
If your feature request involves an existing yt-dlp command, provide the complete verbose output of that command.
Add the `-vU` flag to **your** command line you run yt-dlp with (`yt-dlp -vU <your command line>`), copy the WHOLE output and insert it below.
It should look similar to this:
placeholder: |
[debug] Command-line config: ['-vU', 'http://www.youtube.com/watch?v=BaW_jenozKc']
[debug] Portable config file: yt-dlp.conf
[debug] Portable config: ['-i']
[debug] Encodings: locale cp1252, fs utf-8, stdout utf-8, stderr utf-8, pref cp1252
[debug] yt-dlp version 2021.12.01 (exe)
[debug] Python version 3.8.8 (CPython 64bit) - Windows-10-10.0.19041-SP0
[debug] exe versions: ffmpeg 3.0.1, ffprobe 3.0.1
[debug] Optional libraries: Cryptodome, keyring, mutagen, sqlite, websockets
[debug] Proxy map: {}
yt-dlp is up to date (2021.12.01)
<more lines>
render: shell


@@ -35,7 +35,7 @@ body:
attributes:
label: Verbose log
description: |
If your question involes a yt-dlp command, provide the complete verbose output of that command.
If your question involves a yt-dlp command, provide the complete verbose output of that command.
Add the `-vU` flag to **your** command line you run yt-dlp with (`yt-dlp -vU <your command line>`), copy the WHOLE output and insert it below.
It should look similar to this:
placeholder: |


@@ -11,7 +11,7 @@ body:
options:
- label: I'm reporting a broken site
required: true
- label: I've verified that I'm running yt-dlp version **%(version)s**. ([update instructions](https://github.com/yt-dlp/yt-dlp#update))
- label: I've verified that I'm running yt-dlp version **%(version)s** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are alive and playable in a browser
required: true


@@ -11,7 +11,7 @@ body:
options:
- label: I'm reporting a new site support request
required: true
- label: I've verified that I'm running yt-dlp version **%(version)s**. ([update instructions](https://github.com/yt-dlp/yt-dlp#update))
- label: I've verified that I'm running yt-dlp version **%(version)s** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are alive and playable in a browser
required: true


@@ -11,7 +11,7 @@ body:
options:
- label: I'm reporting a site feature request
required: true
- label: I've verified that I'm running yt-dlp version **%(version)s**. ([update instructions](https://github.com/yt-dlp/yt-dlp#update))
- label: I've verified that I'm running yt-dlp version **%(version)s** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are alive and playable in a browser
required: true


@@ -11,7 +11,7 @@ body:
options:
- label: I'm reporting a bug unrelated to a specific site
required: true
- label: I've verified that I'm running yt-dlp version **%(version)s**. ([update instructions](https://github.com/yt-dlp/yt-dlp#update))
- label: I've verified that I'm running yt-dlp version **%(version)s** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are alive and playable in a browser
required: true


@@ -13,7 +13,7 @@ body:
required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true
- label: I've verified that I'm running yt-dlp version **%(version)s**. ([update instructions](https://github.com/yt-dlp/yt-dlp#update))
- label: I've verified that I'm running yt-dlp version **%(version)s** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues including closed ones. DO NOT post duplicates
required: true
@@ -30,3 +30,24 @@ body:
placeholder: WRITE DESCRIPTION HERE
validations:
required: true
- type: textarea
id: log
attributes:
label: Verbose log
description: |
If your feature request involves an existing yt-dlp command, provide the complete verbose output of that command.
Add the `-vU` flag to **your** command line you run yt-dlp with (`yt-dlp -vU <your command line>`), copy the WHOLE output and insert it below.
It should look similar to this:
placeholder: |
[debug] Command-line config: ['-vU', 'http://www.youtube.com/watch?v=BaW_jenozKc']
[debug] Portable config file: yt-dlp.conf
[debug] Portable config: ['-i']
[debug] Encodings: locale cp1252, fs utf-8, stdout utf-8, stderr utf-8, pref cp1252
[debug] yt-dlp version 2021.12.01 (exe)
[debug] Python version 3.8.8 (CPython 64bit) - Windows-10-10.0.19041-SP0
[debug] exe versions: ffmpeg 3.0.1, ffprobe 3.0.1
[debug] Optional libraries: Cryptodome, keyring, mutagen, sqlite, websockets
[debug] Proxy map: {}
yt-dlp is up to date (2021.12.01)
<more lines>
render: shell


@@ -35,7 +35,7 @@ body:
attributes:
label: Verbose log
description: |
If your question involes a yt-dlp command, provide the complete verbose output of that command.
If your question involves a yt-dlp command, provide the complete verbose output of that command.
Add the `-vU` flag to **your** command line you run yt-dlp with (`yt-dlp -vU <your command line>`), copy the WHOLE output and insert it below.
It should look similar to this:
placeholder: |


@@ -161,11 +161,10 @@ jobs:
steps:
- uses: actions/checkout@v2
# In order to create a universal2 application, the version of python3 in /usr/bin has to be used
# Pyinstaller is pinned to 4.5.1 because the builds are failing in 4.6, 4.7
- name: Install Requirements
run: |
brew install coreutils
/usr/bin/python3 -m pip install -U --user pip Pyinstaller==4.5.1 -r requirements.txt
/usr/bin/python3 -m pip install -U --user pip Pyinstaller==4.10 -r requirements.txt
- name: Bump version
id: bump_version
run: /usr/bin/python3 devscripts/update-version.py
@@ -234,7 +233,7 @@ jobs:
# Custom pyinstaller built with https://github.com/yt-dlp/pyinstaller-builds
run: |
python -m pip install --upgrade pip setuptools wheel py2exe
pip install "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-4.5.1-py3-none-any.whl" -r requirements.txt
pip install "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-4.10-py3-none-any.whl" -r requirements.txt
- name: Bump version
id: bump_version
env:
@@ -321,7 +320,7 @@ jobs:
- name: Install Requirements
run: |
python -m pip install --upgrade pip setuptools wheel
pip install "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-4.5.1-py3-none-any.whl" -r requirements.txt
pip install "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-4.10-py3-none-any.whl" -r requirements.txt
- name: Bump version
id: bump_version
env:

.gitignore (+5)

@@ -24,6 +24,7 @@ cookies
*.3gp
*.ape
*.ass
*.avi
*.desktop
*.flac
@@ -106,6 +107,7 @@ yt-dlp.zip
*.iml
.vscode
*.sublime-*
*.code-workspace
# Lazy extractors
*/extractor/lazy_extractors.py
@@ -114,3 +116,6 @@ yt-dlp.zip
ytdlp_plugins/extractor/*
!ytdlp_plugins/extractor/__init__.py
!ytdlp_plugins/extractor/sample.py
ytdlp_plugins/postprocessor/*
!ytdlp_plugins/postprocessor/__init__.py
!ytdlp_plugins/postprocessor/sample.py


@@ -1,22 +0,0 @@
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
# Required
version: 2
# Build documentation in the docs/ directory with Sphinx
sphinx:
configuration: docs/conf.py
# Optionally build your docs in additional formats such as PDF
formats:
- epub
- pdf
- htmlzip
# Optionally set the version of Python and requirements required to build your docs
python:
version: 3
install:
- requirements: docs/requirements.txt


@@ -11,6 +11,7 @@
- [Is anyone going to need the feature?](#is-anyone-going-to-need-the-feature)
- [Is your question about yt-dlp?](#is-your-question-about-yt-dlp)
- [Are you willing to share account details if needed?](#are-you-willing-to-share-account-details-if-needed)
- [Is the website primarily used for piracy](#is-the-website-primarily-used-for-piracy)
- [DEVELOPER INSTRUCTIONS](#developer-instructions)
- [Adding new feature or making overarching changes](#adding-new-feature-or-making-overarching-changes)
- [Adding support for a new site](#adding-support-for-a-new-site)
@@ -24,6 +25,7 @@
- [Collapse fallbacks](#collapse-fallbacks)
- [Trailing parentheses](#trailing-parentheses)
- [Use convenience conversion and parsing functions](#use-convenience-conversion-and-parsing-functions)
- [My pull request is labeled pending-fixes](#my-pull-request-is-labeled-pending-fixes)
- [EMBEDDING YT-DLP](README.md#embedding-yt-dlp)
@@ -123,6 +125,10 @@ While these steps won't necessarily ensure that no misuse of the account takes p
- Change the password before sharing the account to something random (use [this](https://passwordsgenerator.net/) if you don't have a random password generator).
- Change the password after receiving the account back.
### Is the website primarily used for piracy?
We follow [youtube-dl's policy](https://github.com/ytdl-org/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free) of not supporting services that are primarily used for infringing copyright. Additionally, it has been decided not to support porn sites that specialize in deep fakes. We also cannot support any service that serves only [DRM protected content](https://en.wikipedia.org/wiki/Digital_rights_management).
@@ -210,7 +216,7 @@ After you have ensured this site is distributing its content legally, you can fo
}
```
1. Add an import in [`yt_dlp/extractor/extractors.py`](yt_dlp/extractor/extractors.py).
1. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, the tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. You can also run all the tests in one go with `TestDownload.test_YourExtractor_all`
1. Run `python test/test_download.py TestDownload.test_YourExtractor` (note that `YourExtractor` doesn't end with `IE`). This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, the tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. You can also run all the tests in one go with `TestDownload.test_YourExtractor_all`
1. Make sure you have at least one test for your extractor. Even if all videos covered by the extractor are expected to be inaccessible for automated testing, tests should still be added with a `skip` parameter indicating why the particular test is disabled from running.
1. Have a look at [`yt_dlp/extractor/common.py`](yt_dlp/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](yt_dlp/extractor/common.py#L91-L426). Add tests and code for as many as you want.
1. Make sure your code follows [yt-dlp coding conventions](#yt-dlp-coding-conventions) and check the code with [flake8](https://flake8.pycqa.org/en/latest/index.html#quickstart):
@@ -528,13 +534,13 @@ Extracting variables is acceptable for reducing code duplication and improving r
Correct:
```python
title = self._html_search_regex(r'<title>([^<]+)</title>', webpage, 'title')
title = self._html_search_regex(r'<h1>([^<]+)</h1>', webpage, 'title')
```
Incorrect:
```python
TITLE_RE = r'<title>([^<]+)</title>'
TITLE_RE = r'<h1>([^<]+)</h1>'
# ...some lines of code...
title = self._html_search_regex(TITLE_RE, webpage, 'title')
```
@@ -637,7 +643,7 @@ Wrap all extracted numeric data into safe functions from [`yt_dlp/utils.py`](yt_
Use `url_or_none` for safe URL processing.
Use `try_get`, `dict_get` and `traverse_obj` for safe metadata extraction from parsed JSON.
Use `traverse_obj` and `try_call` (supersedes `dict_get` and `try_get`) for safe metadata extraction from parsed JSON.
Use `unified_strdate` for uniform `upload_date` or any `YYYYMMDD` meta field extraction, `unified_timestamp` for uniform `timestamp` extraction, `parse_filesize` for `filesize` extraction, `parse_count` for count meta fields extraction, `parse_resolution`, `parse_duration` for `duration` extraction, `parse_age_limit` for `age_limit` extraction.
@@ -658,6 +664,10 @@ duration = float_or_none(video.get('durationMs'), scale=1000)
view_count = int_or_none(video.get('views'))
```
# My pull request is labeled pending-fixes
The `pending-fixes` label is added when there are changes requested to a PR. When the necessary changes are made, the label should be removed. However, despite our best efforts, it may sometimes happen that the maintainer did not see the changes or forgot to remove the label. If your PR is still marked as `pending-fixes` a few days after all requested changes have been made, feel free to ping the maintainer who labeled your issue and ask them to re-review and remove the label.


@@ -146,7 +146,7 @@ chio0hai
cntrl-s
Deer-Spangle
DEvmIb
Grabien
Grabien/MaximVol
j54vc1bk
mpeter50
mrpapersonic
@@ -160,7 +160,7 @@ PilzAdam
zmousm
iw0nderhow
unit193
TwoThousandHedgehogs
TwoThousandHedgehogs/KathrynElrod
Jertzukka
cypheron
Hyeeji
@@ -194,3 +194,40 @@ KiberInfinity
tejing1
Bricio
lazypete365
Aniruddh-J
blackgear
CplPwnies
cyberfox1691
FestplattenSchnitzel
hatienl0i261299
iphoting
jakeogh
lukasfink1
lyz-code
marieell
mdpauley
Mipsters
mxmehl
ofkz
P-reducible
pycabbage
regarten
Ronnnny
schn0sch
s0u1h
MrRawes
cffswb
danielyli
1-Byte
mehq
dzek69
aaearon
panatexxa
kmark
un-def
goggle
Soebb
Fam0r
bohwaz
dodrian
vvto33


@@ -11,6 +11,277 @@
-->
### 2022.04.08
* Use certificates from `certifi` if installed by [coletdjnz](https://github.com/coletdjnz)
* Treat multiple `--match-filters` as OR
* File locking improvements:
* Do not lock downloading file on Windows
* Do not prevent download if locking is unsupported
* Do not truncate files before locking by [jakeogh](https://github.com/jakeogh), [pukkandan](https://github.com/pukkandan)
* Fix non-blocking non-exclusive lock
* De-prioritize automatic-subtitles when no `--sub-lang` is given
* Exit after `--dump-user-agent`
* Fallback to video-only format when selecting by extension
* Fix `--abort-on-error` for subtitles
* Fix `--no-overwrite` for playlist infojson
* Fix `--print` with `--ignore-no-formats` when url is `None` by [flashdagger](https://github.com/flashdagger)
* Fix `--sleep-interval`
* Fix `--throttled-rate`
* Fix `autonumber`
* Fix case of `http_headers`
* Fix filepath sanitization in `--print-to-file`
* Handle float in `--wait-for-video`
* Ignore `mhtml` formats from `-f mergeall`
* Ignore format-specific fields in initial pass of `--match-filter`
* Protect stdout from unexpected progress and console-title
* Remove `Accept-Encoding` header from `std_headers` by [coletdjnz](https://github.com/coletdjnz)
* Remove incorrect warning for `--dateafter`
* Show warning when all media formats have DRM
* [downloader] Fix invocation of `HttpieFD`
* [http] Fix #3215
* [http] Reject broken range before request by [Lesmiscore](https://github.com/Lesmiscore), [Jules-A](https://github.com/Jules-A), [pukkandan](https://github.com/pukkandan)
* [fragment] Read downloaded fragments only when needed by [Lesmiscore](https://github.com/Lesmiscore)
* [http] Retry on more errors by [coletdjnz](https://github.com/coletdjnz)
* [mhtml] Fix fragments with absolute urls by [coletdjnz](https://github.com/coletdjnz)
* [extractor] Add `_perform_login` function
* [extractor] Allow control characters inside json
* [extractor] Support merging subtitles with data by [coletdjnz](https://github.com/coletdjnz)
* [generic] Extract subtitles from video.js by [Lesmiscore](https://github.com/Lesmiscore)
* [ffmpeg] Cache version data
* [FFmpegConcat] Ensure final directory exists
* [FfmpegMetadata] Write id3v1 tags
* [FFmpegVideoConvertor] Add more formats to `--remux-video`
* [FFmpegVideoConvertor] Ensure all streams are copied
* [MetadataParser] Validate outtmpl early
* [outtmpl] Fix replacement/default when used with alternate
* [outtmpl] Limit changes during sanitization
* [phantomjs] Fix bug
* [test] Add `test_locked_file`
* [utils] `format_decimal_suffix`: Fix for very large numbers by [s0u1h](https://github.com/s0u1h)
* [utils] `traverse_obj`: Allow filtering by value
* [utils] Add `filter_dict`, `get_first`, `try_call`
* [utils] ExtractorError: Fix for older python versions
* [utils] WebSocketsWrapper: Allow omitting `__enter__` invocation by [Lesmiscore](https://github.com/Lesmiscore)
* [docs] Add an `.editorconfig` file by [fstirlitz](https://github.com/fstirlitz)
* [docs] Clarify the exact `BSD` license of dependencies by [MrRawes](https://github.com/MrRawes)
* [docs] Minor improvements by [pukkandan](https://github.com/pukkandan), [cffswb](https://github.com/cffswb), [danielyli](https://github.com/danielyli)
* [docs] Remove readthedocs
* [build] Add `requirements.txt` to pip distributions
* [cleanup, postprocessor] Create `_download_json`
* [cleanup, vimeo] Fix tests
* [cleanup] Misc fixes and minor cleanup
* [cleanup] Use `_html_extract_title`
* [AfreecaTV] Add `AfreecaTVUserIE` by [hatienl0i261299](https://github.com/hatienl0i261299)
* [arte] Add `format_note` to m3u8 formats
* [azmedien] Add TVO Online to supported hosts by [1-Byte](https://github.com/1-Byte)
* [BanBye] Add extractor by [mehq](https://github.com/mehq)
* [bilibili] Fix extraction of title with quotes by [dzek69](https://github.com/dzek69)
* [Craftsy] Add extractor by [Bricio](https://github.com/Bricio)
* [Cybrary] Add extractor by [aaearon](https://github.com/aaearon)
* [Huya] Add extractor by [hatienl0i261299](https://github.com/hatienl0i261299)
* [ITProTV] Add extractor by [aaearon](https://github.com/aaearon)
* [Jable] Add extractors by [mehq](https://github.com/mehq)
* [LastFM] Add extractors by [mehq](https://github.com/mehq)
* [Moviepilot] Add extractor by [panatexxa](https://github.com/panatexxa)
* [panopto] Add extractors by [coletdjnz](https://github.com/coletdjnz), [kmark](https://github.com/kmark)
* [PokemonSoundLibrary] Add extractor by [Lesmiscore](https://github.com/Lesmiscore)
* [WasdTV] Add extractor by [un-def](https://github.com/un-def), [hatienl0i261299](https://github.com/hatienl0i261299)
* [adobepass] Fix Suddenlink MSO by [CplPwnies](https://github.com/CplPwnies)
* [afreecatv] Match new vod url by [wlritchi](https://github.com/wlritchi)
* [AZMedien] Support `tv.telezueri.ch` by [goggle](https://github.com/goggle)
* [BiliIntl] Support user-generated videos by [wlritchi](https://github.com/wlritchi)
* [BRMediathek] Fix VALID_URL
* [crunchyroll:playlist] Implement beta API by [tejing1](https://github.com/tejing1)
* [crunchyroll] Fix inheritance
* [daftsex] Fix extractor by [Soebb](https://github.com/Soebb)
* [dailymotion] Support `geo.dailymotion.com` by [hatienl0i261299](https://github.com/hatienl0i261299)
* [ellentube] Extract subtitles from manifest
* [elonet] Rewrite extractor by [Fam0r](https://github.com/Fam0r), [pukkandan](https://github.com/pukkandan)
* [fptplay] Fix metadata extraction by [hatienl0i261299](https://github.com/hatienl0i261299)
* [FranceCulture] Support playlists by [bohwaz](https://github.com/bohwaz)
* [go, viu] Extract subtitles from the m3u8 manifest by [fstirlitz](https://github.com/fstirlitz)
* [Imdb] Improve extractor by [hatienl0i261299](https://github.com/hatienl0i261299)
* [MangoTV] Improve extractor by [hatienl0i261299](https://github.com/hatienl0i261299)
* [Nebula] Fix bug in 52efa4b31200119adaa8acf33e50b84fcb6948f0
* [niconico] Fix extraction of thumbnails and uploader (#3266)
* [niconico] Rewrite NiconicoIE by [Lesmiscore](https://github.com/Lesmiscore)
* [nitter] Minor fixes and update instance list by [foghawk](https://github.com/foghawk)
* [NRK] Extract timestamp by [hatienl0i261299](https://github.com/hatienl0i261299)
* [openrec] Download archived livestreams by [Lesmiscore](https://github.com/Lesmiscore)
* [openrec] Refactor extractors by [Lesmiscore](https://github.com/Lesmiscore)
* [panopto] Improve subtitle extraction and support slides by [coletdjnz](https://github.com/coletdjnz)
* [ParamountPlus, CBS] Change VALID_URL by [Sipherdrakon](https://github.com/Sipherdrakon)
* [ParamountPlusSeries] Support multiple pages by [dodrian](https://github.com/dodrian)
* [Piapro] Extract description with break lines by [Lesmiscore](https://github.com/Lesmiscore)
* [rai] Fix extraction of http formats by [nixxo](https://github.com/nixxo)
* [rumble] Unescape title
* [RUTV] Fix format sorting by [Lesmiscore](https://github.com/Lesmiscore)
* [ruutu] Detect embeds by [tpikonen](https://github.com/tpikonen)
* [tenplay] Improve extractor by [aarubui](https://github.com/aarubui)
* [TikTok] Fix URLs with user id by [hatienl0i261299](https://github.com/hatienl0i261299)
* [TikTokVM] Fix redirect to user URL
* [TVer] Fix extractor by [Lesmiscore](https://github.com/Lesmiscore)
* [TVer] Support landing page by [vvto33](https://github.com/vvto33)
* [twitcasting] Don't return multi_video for archive with single hls manifest by [Lesmiscore](https://github.com/Lesmiscore)
* [veo] Fix `_VALID_URL`
* [Veo] Fix extractor by [i6t](https://github.com/i6t)
* [viki] Don't attempt to modify URLs with signature by [nyuszika7h](https://github.com/nyuszika7h)
* [viu] Fix bypass for preview by [zackmark29](https://github.com/zackmark29)
* [viu] Fix extractor by [zackmark29](https://github.com/zackmark29), [pukkandan](https://github.com/pukkandan)
* [web.archive:youtube] Make CDX API requests non-fatal by [coletdjnz](https://github.com/coletdjnz)
* [wget] Fix proxy by [kikuyan](https://github.com/kikuyan), [coletdjnz](https://github.com/coletdjnz)
* [xnxx] Add `xnxx3.com` by [rozari0](https://github.com/rozari0)
* [youtube] **Add new age-gate bypass** by [zerodytrash](https://github.com/zerodytrash), [pukkandan](https://github.com/pukkandan)
* [youtube] Add extractor-arg to skip auto-translated subs
* [youtube] Avoid false positives when detecting damaged formats
* [youtube] Detect DRM better by [shirt](https://github.com/shirt-dev)
* [youtube] Fix auto-translated automatic captions
* [youtube] Fix pagination of `membership` tab
* [youtube] Fix uploader for collaborative playlists by [coletdjnz](https://github.com/coletdjnz)
* [youtube] Improve video upload date handling by [coletdjnz](https://github.com/coletdjnz)
* [youtube:api] Prefer minified JSON response by [coletdjnz](https://github.com/coletdjnz)
* [youtube:search] Support hashtag entries by [coletdjnz](https://github.com/coletdjnz)
* [youtube:tab] Fix duration extraction for shorts by [coletdjnz](https://github.com/coletdjnz)
* [youtube:tab] Minor improvements
* [youtube:tab] Return shorts url if video is a short by [coletdjnz](https://github.com/coletdjnz)
* [Zattoo] Fix extractors by [goggle](https://github.com/goggle)
* [Zingmp3] Fix signature by [hatienl0i261299](https://github.com/hatienl0i261299)
### 2022.03.08.1
* [cleanup] Refactor `__init__.py`
* [build] Fix bug
### 2022.03.08
* Merge youtube-dl: Upto [commit/6508688](https://github.com/ytdl-org/youtube-dl/commit/6508688e88c83bb811653083db9351702cd39a6a) (except NDR)
* Add regex operator and quoting to format filters by [lukasfink1](https://github.com/lukasfink1)
* Add brotli content-encoding support by [coletdjnz](https://github.com/coletdjnz)
* Add pre-processor stage `after_filter`
* Better error message when no `--live-from-start` format
* Create necessary directories for `--print-to-file`
* Fill more fields for playlists by [Lesmiscore](https://github.com/Lesmiscore)
* Fix `-all` for `--sub-langs`
* Fix doubling of `video_id` in `ExtractorError`
* Fix for when stdout/stderr encoding is `None`
* Handle negative duration from extractor
* Implement `--add-header` without modifying `std_headers`
* Obey `--abort-on-error` for "ffmpeg not installed"
* Set `webpage_url_...` from `webpage_url` and not input URL
* Tolerate failure to `--write-link` due to unknown URL
* [aria2c] Add `--http-accept-gzip=true`
* [build] Update pyinstaller to 4.10 by [shirt](https://github.com/shirt-dev)
* [cookies] Update MacOS12 `Cookies.binarycookies` location by [mdpauley](https://github.com/mdpauley)
* [devscripts] Improve `prepare_manpage`
* [downloader] Do not use aria2c for non-native `m3u8`
* [downloader] Obey `--file-access-retries` when deleting/renaming by [ehoogeveen-medweb](https://github.com/ehoogeveen-medweb)
* [extractor] Allow `http_headers` to be specified for `thumbnails`
* [extractor] Extract subtitles from manifests for vimeo, globo, kaltura, svt by [fstirlitz](https://github.com/fstirlitz)
* [extractor] Fix for manifests without period duration by [dirkf](https://github.com/dirkf), [pukkandan](https://github.com/pukkandan)
* [extractor] Support `--mark-watched` without `_NETRC_MACHINE` by [coletdjnz](https://github.com/coletdjnz)
* [FFmpegConcat] Abort on `--simulate`
* [FormatSort] Consider `acodec`=`ogg` as `vorbis`
* [fragment] Fix bugs around resuming with Range by [Lesmiscore](https://github.com/Lesmiscore)
* [fragment] Improve `--live-from-start` for YouTube livestreams by [Lesmiscore](https://github.com/Lesmiscore)
* [generic] Pass referer to extracted formats
* [generic] Set rss `guid` as video id by [Bricio](https://github.com/Bricio)
* [options] Better ambiguous option resolution
* [options] Rename `--clean-infojson` to `--clean-info-json`
* [SponsorBlock] Fixes for highlight and "full video labels" by [nihil-admirari](https://github.com/nihil-admirari)
* [SponsorBlock] Minor fixes by [nihil-admirari](https://github.com/nihil-admirari)
* [utils] Better traceback for `ExtractorError`
* [utils] Fix file locking for AOSP by [jakeogh](https://github.com/jakeogh)
* [utils] Improve file locking
* [utils] OnDemandPagedList: Do not download pages after error
* [utils] render_table: Fix character calculation for removing extra gap by [Lesmiscore](https://github.com/Lesmiscore)
* [utils] Use `locked_file` for `sanitize_open` by [jakeogh](https://github.com/jakeogh)
* [utils] Validate `DateRange` input
* [utils] WebSockets wrapper for non-async functions by [Lesmiscore](https://github.com/Lesmiscore)
* [cleanup] Don't pass protocol to `_extract_m3u8_formats` for live videos
* [cleanup] Remove extractors for some dead websites by [marieell](https://github.com/marieell)
* [cleanup, docs] Misc cleanup
* [AbemaTV] Add extractors by [Lesmiscore](https://github.com/Lesmiscore)
* [adobepass] Add Suddenlink MSO by [CplPwnies](https://github.com/CplPwnies)
* [ant1newsgr] Add extractor by [zmousm](https://github.com/zmousm)
* [bigo] Add extractor by [Lesmiscore](https://github.com/Lesmiscore)
* [Caltrans] Add extractor by [Bricio](https://github.com/Bricio)
* [daystar] Add extractor by [hatienl0i261299](https://github.com/hatienl0i261299)
* [fc2:live] Add extractor by [Lesmiscore](https://github.com/Lesmiscore)
* [fptplay] Add extractor by [hatienl0i261299](https://github.com/hatienl0i261299)
* [murrtube] Add extractor by [cyberfox1691](https://github.com/cyberfox1691)
* [nfb] Add extractor by [ofkz](https://github.com/ofkz)
* [niconico] Add playlist extractors and refactor by [Lesmiscore](https://github.com/Lesmiscore)
* [peekvids] Add extractor by [schn0sch](https://github.com/schn0sch)
* [piapro] Add extractor by [pycabbage](https://github.com/pycabbage), [Lesmiscore](https://github.com/Lesmiscore)
* [rokfin] Add extractor by [P-reducible](https://github.com/P-reducible), [pukkandan](https://github.com/pukkandan)
* [rokfin] Add stack and channel extractors by [P-reducible](https://github.com/P-reducible), [pukkandan](https://github.com/pukkandan)
* [ruv.is] Add extractor by [iw0nderhow](https://github.com/iw0nderhow)
* [telegram] Add extractor by [hatienl0i261299](https://github.com/hatienl0i261299)
* [VideocampusSachsen] Add extractors by [FestplattenSchnitzel](https://github.com/FestplattenSchnitzel)
* [xinpianchang] Add extractor by [hatienl0i261299](https://github.com/hatienl0i261299)
* [abc] Support 1080p by [Ronnnny](https://github.com/Ronnnny)
* [afreecatv] Support password-protected livestreams by [wlritchi](https://github.com/wlritchi)
* [ard] Fix valid URL
* [ATVAt] Detect geo-restriction by [marieell](https://github.com/marieell)
* [bandcamp] Detect acodec
* [bandcamp] Fix user URLs by [lyz-code](https://github.com/lyz-code)
* [bbc] Fix extraction of news articles by [ajj8](https://github.com/ajj8)
* [beeg] Fix extractor by [Bricio](https://github.com/Bricio)
* [bigo] Fix extractor to not use `form_params`
* [Bilibili] Pass referer for all formats by [blackgear](https://github.com/blackgear)
* [Biqle] Fix extractor by [Bricio](https://github.com/Bricio)
* [ccma] Fix timestamp parsing by [nyuszika7h](https://github.com/nyuszika7h)
* [crunchyroll] Better error reporting on login failure by [tejing1](https://github.com/tejing1)
* [cspan] Support C-Span congress videos by [Grabien](https://github.com/Grabien)
* [dropbox] Fix regex by [zenerdi0de](https://github.com/zenerdi0de)
* [fc2] Fix extraction by [Lesmiscore](https://github.com/Lesmiscore)
* [fujitv] Extract resolution for free sources by [YuenSzeHong](https://github.com/YuenSzeHong)
* [Gettr] Add `GettrStreamingIE` by [i6t](https://github.com/i6t)
* [Gettr] Fix formats order by [i6t](https://github.com/i6t)
* [Gettr] Improve extractor by [i6t](https://github.com/i6t)
* [globo] Expand valid URL by [Bricio](https://github.com/Bricio)
* [lbry] Fix `--ignore-no-formats-error`
* [manyvids] Extract `uploader` by [regarten](https://github.com/regarten)
* [mildom] Fix linter
* [mildom] Rework extractors by [Lesmiscore](https://github.com/Lesmiscore)
* [mirrativ] Cleanup extractor code by [Lesmiscore](https://github.com/Lesmiscore)
* [nhk] Add support for NHK for School by [Lesmiscore](https://github.com/Lesmiscore)
* [niconico:tag] Add support for searching tags
* [nrk] Add fallback API
* [peekvids] Use JSON-LD by [schn0sch](https://github.com/schn0sch)
* [peertube] Add media.fsfe.org by [mxmehl](https://github.com/mxmehl)
* [rtvs] Fix extractor by [Bricio](https://github.com/Bricio)
* [spiegel] Fix `_VALID_URL`
* [ThumbnailsConvertor] Support `webp`
* [tiktok] Fix `vm.tiktok`/`vt.tiktok` URLs
* [tubitv] Fix/improve TV series extraction by [bbepis](https://github.com/bbepis)
* [tumblr] Fix extractor by [foghawk](https://github.com/foghawk)
* [twitcasting] Add fallback for finding running live by [Lesmiscore](https://github.com/Lesmiscore)
* [TwitCasting] Check for password protection by [Lesmiscore](https://github.com/Lesmiscore)
* [twitcasting] Fix extraction by [Lesmiscore](https://github.com/Lesmiscore)
* [twitch] Fix field name of `view_count`
* [twitter] Fix for private videos by [iphoting](https://github.com/iphoting)
* [washingtonpost] Fix extractor by [Bricio](https://github.com/Bricio)
* [youtube:tab] Add `approximate_date` extractor-arg
* [youtube:tab] Follow redirect to regional channel by [coletdjnz](https://github.com/coletdjnz)
* [youtube:tab] Reject webpage data if redirected to home page
* [youtube] De-prioritize potentially damaged formats
* [youtube] Differentiate descriptive audio by language code
* [youtube] Ensure subtitle urls are absolute by [coletdjnz](https://github.com/coletdjnz)
* [youtube] Escape possible `$` in `_extract_n_function_name` regex by [Lesmiscore](https://github.com/Lesmiscore)
* [youtube] Fix automatic captions
* [youtube] Fix n-sig extraction for phone player JS by [MinePlayersPE](https://github.com/MinePlayersPE)
* [youtube] Further de-prioritize 3gp format
* [youtube] Label original auto-subs
* [youtube] Prefer UTC upload date for videos by [coletdjnz](https://github.com/coletdjnz)
* [zaq1] Remove dead extractor by [marieell](https://github.com/marieell)
* [zee5] Support web-series by [Aniruddh-J](https://github.com/Aniruddh-J)
* [zingmp3] Fix extractor by [hatienl0i261299](https://github.com/hatienl0i261299)
* [zoom] Add support for screen cast by [Mipsters](https://github.com/Mipsters)
### 2022.02.04
* [youtube:search] Fix extractor by [coletdjnz](https://github.com/coletdjnz)

View File

@@ -29,6 +29,7 @@ You can also find lists of all [contributors of yt-dlp](CONTRIBUTORS) and [autho
* YouTube improvements including: age-gate bypass, private playlists, multiple-clients (to avoid throttling) and a lot of under-the-hood improvements
* Added support for downloading YoutubeWebArchive videos
* Added support for new websites MainStreaming, PRX, nzherald, etc

View File

@@ -5,5 +5,6 @@ include README.md
include completions/*/*
include supportedsites.md
include yt-dlp.1
include requirements.txt
recursive-include devscripts *
recursive-include test *

View File

@@ -16,7 +16,7 @@ pypi-files: AUTHORS Changelog.md LICENSE README.md README.txt supportedsites com
clean-test:
rm -rf test/testdata/sigs/player-*.js tmp/ *.annotations.xml *.aria2 *.description *.dump *.frag \
*.frag.aria2 *.frag.urls *.info.json *.live_chat.json *.meta *.part* *.tmp *.temp *.unknown_video *.ytdl \
*.3gp *.ape *.avi *.desktop *.flac *.flv *.jpeg *.jpg *.m4a *.m4v *.mhtml *.mkv *.mov *.mp3 \
*.3gp *.ape *.ass *.avi *.desktop *.flac *.flv *.jpeg *.jpg *.m4a *.m4v *.mhtml *.mkv *.mov *.mp3 \
*.mp4 *.ogg *.opus *.png *.sbv *.srt *.swf *.swp *.ttml *.url *.vtt *.wav *.webloc *.webm *.webp
clean-dist:
rm -rf yt-dlp.1.temp.md yt-dlp.1 README.txt MANIFEST build/ dist/ .coverage cover/ yt-dlp.tar.gz completions/ \

README.md
View File

@@ -3,15 +3,14 @@
[![YT-DLP](https://raw.githubusercontent.com/yt-dlp/yt-dlp/master/.github/banner.svg)](#readme)
[![Release version](https://img.shields.io/github/v/release/yt-dlp/yt-dlp?color=blue&label=Download&style=for-the-badge)](#release-files "Release")
[![License: Unlicense](https://img.shields.io/badge/-Unlicense-brightgreen.svg?style=for-the-badge)](LICENSE "License")
[![Donate](https://img.shields.io/badge/_-Donate-red.svg?logo=githubsponsors&labelColor=555555&style=for-the-badge)](Collaborators.md#collaborators "Donate")
[![Docs](https://img.shields.io/badge/-Docs-blue.svg?color=blue&style=for-the-badge)](https://readthedocs.org/projects/yt-dlp/ "Docs")
[![Supported Sites](https://img.shields.io/badge/-Supported_Sites-brightgreen.svg?style=for-the-badge)](supportedsites.md "Supported Sites")
[![Release version](https://img.shields.io/github/v/release/yt-dlp/yt-dlp?color=brightgreen&label=Download&style=for-the-badge)](#release-files "Release")
[![PyPi](https://img.shields.io/badge/-PyPi-blue.svg?logo=pypi&labelColor=555555&style=for-the-badge)](https://pypi.org/project/yt-dlp "PyPi")
[![CI Status](https://img.shields.io/github/workflow/status/yt-dlp/yt-dlp/Core%20Tests/master?label=Tests&style=for-the-badge)](https://github.com/yt-dlp/yt-dlp/actions "CI Status")
[![Discord](https://img.shields.io/discord/807245652072857610?color=blue&labelColor=555555&label=&logo=discord&style=for-the-badge)](https://discord.gg/H5MNcFW63r "Discord")
[![Donate](https://img.shields.io/badge/_-Donate-red.svg?logo=githubsponsors&labelColor=555555&style=for-the-badge)](Collaborators.md#collaborators "Donate")
[![Matrix](https://img.shields.io/matrix/yt-dlp:matrix.org?color=brightgreen&labelColor=555555&label=&logo=element&style=for-the-badge)](https://matrix.to/#/#yt-dlp:matrix.org "Matrix")
[![Discord](https://img.shields.io/discord/807245652072857610?color=blue&labelColor=555555&label=&logo=discord&style=for-the-badge)](https://discord.gg/H5MNcFW63r "Discord")
[![Supported Sites](https://img.shields.io/badge/-Supported_Sites-brightgreen.svg?style=for-the-badge)](supportedsites.md "Supported Sites")
[![License: Unlicense](https://img.shields.io/badge/-Unlicense-blue.svg?style=for-the-badge)](LICENSE "License")
[![CI Status](https://img.shields.io/github/workflow/status/yt-dlp/yt-dlp/Core%20Tests/master?label=Tests&style=for-the-badge)](https://github.com/yt-dlp/yt-dlp/actions "CI Status")
[![Commits](https://img.shields.io/github/commit-activity/m/yt-dlp/yt-dlp?label=commits&style=for-the-badge)](https://github.com/yt-dlp/yt-dlp/commits "Commit History")
[![Last Commit](https://img.shields.io/github/last-commit/yt-dlp/yt-dlp/master?label=&style=for-the-badge)](https://github.com/yt-dlp/yt-dlp/commits "Commit History")
@@ -71,13 +70,13 @@ yt-dlp is a [youtube-dl](https://github.com/ytdl-org/youtube-dl) fork based on t
# NEW FEATURES
* Based on **youtube-dl 2021.12.17 [commit/5add3f4](https://github.com/ytdl-org/youtube-dl/commit/5add3f4373287e6346ca3551239edab549284db3)** and **youtube-dlc 2020.11.11-3 [commit/f9401f2](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee)**: You get all the features and patches of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) in addition to the latest [youtube-dl](https://github.com/ytdl-org/youtube-dl)
* Based on **youtube-dl 2021.12.17 [commit/6508688](https://github.com/ytdl-org/youtube-dl/commit/6508688e88c83bb811653083db9351702cd39a6a)** ([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21)) and **youtube-dlc 2020.11.11-3 [commit/f9401f2](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee)**: You get all the features and patches of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) in addition to the latest [youtube-dl](https://github.com/ytdl-org/youtube-dl)
* **[SponsorBlock Integration](#sponsorblock-options)**: You can mark/remove sponsor sections in youtube videos by utilizing the [SponsorBlock](https://sponsor.ajay.app) API
* **[Format Sorting](#sorting-formats)**: The default format sorting options have been changed so that higher resolution and better codecs will be now preferred instead of simply using larger bitrate. Furthermore, you can now specify the sort order using `-S`. This allows for much easier format selection than what is possible by simply using `--format` ([examples](#format-selection-examples))
* **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--write-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, playlist infojson etc. Note that the NicoNico improvements are not available. See [#31](https://github.com/yt-dlp/yt-dlp/pull/31) for details.
* **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--write-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, playlist infojson etc. Note that the NicoNico livestreams are not available. See [#31](https://github.com/yt-dlp/yt-dlp/pull/31) for details.
* **Youtube improvements**:
* All Feeds (`:ytfav`, `:ytwatchlater`, `:ytsubs`, `:ythistory`, `:ytrec`) and private playlists support downloading multiple pages of content
@@ -112,7 +111,7 @@ yt-dlp is a [youtube-dl](https://github.com/ytdl-org/youtube-dl) fork based on t
* **Other new options**: Many new options have been added such as `--concat-playlist`, `--print`, `--wait-for-video`, `--sleep-requests`, `--convert-thumbnails`, `--write-link`, `--force-download-archive`, `--force-overwrites`, `--break-on-reject` etc
* **Improvements**: Regex and other operators in `--match-filter`, multiple `--postprocessor-args` and `--downloader-args`, faster archive checking, more [format selection options](#format-selection), merge multi-video/audio, multiple `--config-locations`, `--exec` at different stages, etc
* **Improvements**: Regex and other operators in `--format`/`--match-filter`, multiple `--postprocessor-args` and `--downloader-args`, faster archive checking, more [format selection options](#format-selection), merge multi-video/audio, multiple `--config-locations`, `--exec` at different stages, etc
* **Plugins**: Extractors and PostProcessors can be loaded from an external file. See [plugins](#plugins) for details
@@ -126,11 +125,12 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
* The options `--auto-number` (`-A`), `--title` (`-t`) and `--literal` (`-l`), no longer work. See [removed options](#Removed) for details
* `avconv` is not supported as an alternative to `ffmpeg`
* yt-dlp stores config files in slightly different locations to youtube-dl. See [configuration](#configuration) for a list of correct locations
* The default [output template](#output-template) is `%(title)s [%(id)s].%(ext)s`. There is no real reason for this change. This was changed before yt-dlp was ever made public and now there are no plans to change it back to `%(title)s-%(id)s.%(ext)s`. Instead, you may use `--compat-options filename`
* The default [format sorting](#sorting-formats) is different from youtube-dl and prefers higher resolution and better codecs rather than higher bitrates. You can use the `--format-sort` option to change this to any order you prefer, or use `--compat-options format-sort` to use youtube-dl's sorting order
* The default format selector is `bv*+ba/b`. This means that if a combined video + audio format that is better than the best video-only format is found, the former will be preferred. Use `-f bv+ba/b` or `--compat-options format-spec` to revert this
* Unlike youtube-dlc, yt-dlp does not allow merging multiple audio/video streams into one file by default (since this conflicts with the use of `-f bv*+ba`). If needed, this feature must be enabled using `--audio-multistreams` and `--video-multistreams`. You can also use `--compat-options multistreams` to enable both
* `--ignore-errors` is enabled by default. Use `--abort-on-error` or `--compat-options abort-on-error` to abort on errors instead
* `--no-abort-on-error` is enabled by default. Use `--abort-on-error` or `--compat-options abort-on-error` to abort on errors instead
* When writing metadata files such as thumbnails, description or infojson, the same information (if available) is also written for playlists. Use `--no-write-playlist-metafiles` or `--compat-options no-playlist-metafiles` to not write these files
* `--add-metadata` attaches the `infojson` to `mkv` files in addition to writing the metadata when used with `--write-info-json`. Use `--no-embed-info-json` or `--compat-options no-attach-info-json` to revert this
* Some metadata are embedded into different fields when using `--add-metadata` as compared to youtube-dl. Most notably, `comment` field contains the `webpage_url` and `synopsis` contains the `description`. You can [use `--parse-metadata`](#modifying-metadata) to modify this to your liking or use `--compat-options embed-metadata` to revert this
@@ -144,6 +144,8 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
* Thumbnail embedding in `mp4` is done with mutagen if possible. Use `--compat-options embed-thumbnail-atomicparsley` to force the use of AtomicParsley instead
* Some private fields such as filenames are removed by default from the infojson. Use `--no-clean-infojson` or `--compat-options no-clean-infojson` to revert this
* When `--embed-subs` and `--write-subs` are used together, the subtitles are written to disk and also embedded in the media file. You can use just `--embed-subs` to embed the subs and automatically delete the separate file. See [#630 (comment)](https://github.com/yt-dlp/yt-dlp/issues/630#issuecomment-893659460) for more info. `--compat-options no-keep-subs` can be used to revert this
* `certifi` will be used for SSL root certificates, if installed. If you want to use system certificates (e.g. self-signed), use `--compat-options no-certifi`
* youtube-dl tries to remove some superfluous punctuation from filenames. While this can sometimes be helpful, it is often undesirable. So yt-dlp tries to keep the fields in the filenames as close to their original values as possible. You can use `--compat-options filename-sanitization` to revert to youtube-dl's behavior
For ease of use, a few more compat options are available:
* `--compat-options all`: Use all compat options
@@ -202,7 +204,7 @@ python3 -m pip install --no-deps -U yt-dlp
If you want to be on the cutting edge, you can also install the master branch with:
```
python3 -m pip install --force-reinstall https://github.com/yt-dlp/yt-dlp/archive/master.zip
python3 -m pip install --force-reinstall https://github.com/yt-dlp/yt-dlp/archive/master.tar.gz
```
Note that on some systems, you may need to use `py` or `python` instead of `python3`
@@ -230,14 +232,14 @@ If you [installed using Homebrew](#with-homebrew), run `brew upgrade yt-dlp/taps
File|Description
:---|:---
[yt-dlp](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp)|Platform-independent binary. Needs Python (recommended for **UNIX-like systems**)
[yt-dlp](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp)|Platform-independent binary. Needs Python (recommended for **Linux/BSD**)
[yt-dlp.exe](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp.exe)|Windows (Win7 SP1+) standalone x64 binary (recommended for **Windows**)
[yt-dlp_macos](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_macos)|MacOS (10.15+) standalone executable (recommended for **MacOS**)
#### Alternatives
File|Description
:---|:---
[yt-dlp_macos](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_macos)|MacOS (10.15+) standalone executable
[yt-dlp_x86.exe](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_x86.exe)|Windows (Vista SP2+) standalone x86 (32-bit) binary
[yt-dlp_min.exe](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_min.exe)|Windows (Win7 SP1+) standalone x64 binary built with `py2exe`.<br/> Does not contain `pycryptodomex`, needs VC++14
[yt-dlp_win.zip](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_win.zip)|Unpackaged Windows executable (no auto-update)
@@ -263,28 +265,31 @@ On windows, [Microsoft Visual C++ 2010 SP1 Redistributable Package (x86)](https:
While all the other dependencies are optional, `ffmpeg` and `ffprobe` are highly recommended
* [**ffmpeg** and **ffprobe**](https://www.ffmpeg.org) - Required for [merging separate video and audio files](#format-selection) as well as for various [post-processing](#post-processing-options) tasks. License [depends on the build](https://www.ffmpeg.org/legal.html)
* [**mutagen**](https://github.com/quodlibet/mutagen) - For embedding thumbnail in certain formats. Licensed under [GPLv2+](https://github.com/quodlibet/mutagen/blob/master/COPYING)
* [**pycryptodomex**](https://github.com/Legrandin/pycryptodome) - For decrypting AES-128 HLS streams and various other data. Licensed under [BSD2](https://github.com/Legrandin/pycryptodome/blob/master/LICENSE.rst)
* [**websockets**](https://github.com/aaugustin/websockets) - For downloading over websocket. Licensed under [BSD3](https://github.com/aaugustin/websockets/blob/main/LICENSE)
* [**secretstorage**](https://github.com/mitya57/secretstorage) - For accessing the Gnome keyring while decrypting cookies of Chromium-based browsers on Linux. Licensed under [BSD](https://github.com/mitya57/secretstorage/blob/master/LICENSE)
* [**AtomicParsley**](https://github.com/wez/atomicparsley) - For embedding thumbnail in mp4/m4a if mutagen is not present. Licensed under [GPLv2+](https://github.com/wez/atomicparsley/blob/master/COPYING)
* [**mutagen**](https://github.com/quodlibet/mutagen)\* - For embedding thumbnail in certain formats. Licensed under [GPLv2+](https://github.com/quodlibet/mutagen/blob/master/COPYING)
* [**pycryptodomex**](https://github.com/Legrandin/pycryptodome)\* - For decrypting AES-128 HLS streams and various other data. Licensed under [BSD-2-Clause](https://github.com/Legrandin/pycryptodome/blob/master/LICENSE.rst)
* [**websockets**](https://github.com/aaugustin/websockets)\* - For downloading over websocket. Licensed under [BSD-3-Clause](https://github.com/aaugustin/websockets/blob/main/LICENSE)
* [**secretstorage**](https://github.com/mitya57/secretstorage)\* - For accessing the Gnome keyring while decrypting cookies of Chromium-based browsers on Linux. Licensed under [BSD-3-Clause](https://github.com/mitya57/secretstorage/blob/master/LICENSE)
* [**brotli**](https://github.com/google/brotli)\* or [**brotlicffi**](https://github.com/python-hyper/brotlicffi) - [Brotli](https://en.wikipedia.org/wiki/Brotli) content encoding support. Both licensed under MIT <sup>[1](https://github.com/google/brotli/blob/master/LICENSE) [2](https://github.com/python-hyper/brotlicffi/blob/master/LICENSE) </sup>
* [**certifi**](https://github.com/certifi/python-certifi)\* - Provides Mozilla's root certificate bundle. Licensed under [MPLv2](https://github.com/certifi/python-certifi/blob/master/LICENSE)
* [**AtomicParsley**](https://github.com/wez/atomicparsley) - For embedding thumbnail in mp4/m4a if mutagen/ffmpeg cannot. Licensed under [GPLv2+](https://github.com/wez/atomicparsley/blob/master/COPYING)
* [**rtmpdump**](http://rtmpdump.mplayerhq.hu) - For downloading `rtmp` streams. ffmpeg will be used as a fallback. Licensed under [GPLv2+](http://rtmpdump.mplayerhq.hu)
* [**mplayer**](http://mplayerhq.hu/design7/info.html) or [**mpv**](https://mpv.io) - For downloading `rtsp` streams. ffmpeg will be used as a fallback. Licensed under [GPLv2+](https://github.com/mpv-player/mpv/blob/master/Copyright)
* [**phantomjs**](https://github.com/ariya/phantomjs) - Used in extractors where javascript needs to be run. Licensed under [BSD3](https://github.com/ariya/phantomjs/blob/master/LICENSE.BSD)
* [**phantomjs**](https://github.com/ariya/phantomjs) - Used in extractors where javascript needs to be run. Licensed under [BSD-3-Clause](https://github.com/ariya/phantomjs/blob/master/LICENSE.BSD)
* [**sponskrub**](https://github.com/faissaloo/SponSkrub) - For using the now **deprecated** [sponskrub options](#sponskrub-options). Licensed under [GPLv3+](https://github.com/faissaloo/SponSkrub/blob/master/LICENCE.md)
* Any external downloader that you want to use with `--downloader`
To use or redistribute the dependencies, you must agree to their respective licensing terms.
The Windows and MacOS standalone release binaries are already built with the python interpreter, mutagen, pycryptodomex and websockets included.
The Windows and MacOS standalone release binaries are already built with the python interpreter and all optional python packages (marked with \*) included.
<!-- TODO: ffmpeg has merged this patch. Remove this note once there is new release -->
**Note**: There are some regressions in newer ffmpeg versions that cause various issues when used alongside yt-dlp. Since ffmpeg is such an important dependency, we provide [custom builds](https://github.com/yt-dlp/FFmpeg-Builds#ffmpeg-static-auto-builds) with patches for these issues at [yt-dlp/FFmpeg-Builds](https://github.com/yt-dlp/FFmpeg-Builds). See [the readme](https://github.com/yt-dlp/FFmpeg-Builds#patches-applied) for details on the specific issues solved by these builds
## COMPILE
**For Windows**:
To build the Windows executable, you must have pyinstaller (and optionally mutagen, pycryptodomex, websockets). Once you have all the necessary dependencies installed, (optionally) build lazy extractors using `devscripts/make_lazy_extractors.py`, and then just run `pyinst.py`. The executable will be built for the same architecture (32/64 bit) as the python used to build it.
To build the Windows executable, you must have pyinstaller (and any of yt-dlp's optional dependencies if needed). Once you have all the necessary dependencies installed, (optionally) build lazy extractors using `devscripts/make_lazy_extractors.py`, and then just run `pyinst.py`. The executable will be built for the same architecture (32/64 bit) as the python used to build it.
py -m pip install -U pyinstaller -r requirements.txt
py devscripts/make_lazy_extractors.py
@@ -365,8 +370,7 @@ You can also fork the project on github and run your fork's [build workflow](.gi
available. Pass the minimum number of
seconds (or range) to wait between retries
--no-wait-for-video Do not wait for scheduled streams (default)
--mark-watched Mark videos watched (even with --simulate).
Currently only supported for YouTube
--mark-watched Mark videos watched (even with --simulate)
--no-mark-watched Do not mark videos watched (default)
--no-colors Do not emit color codes in output
--compat-options OPTS Options that can help keep compatibility
@@ -428,24 +432,24 @@ You can also fork the project on github and run your fork's [build workflow](.gi
--dateafter DATE Download only videos uploaded on or after
this date. The date formats accepted are the
same as --date
--match-filter FILTER Generic video filter. Any field (see
--match-filters FILTER Generic video filter. Any field (see
"OUTPUT TEMPLATE") can be compared with a
number or a string using the operators
defined in "Filtering formats". You can
also simply specify a field to match if the
field is present and "!field" to check if
the field is not present. In addition,
Python style regular expression matching
can be done using "~=", and multiple
filters can be checked with "&". Use a "\"
to escape "&" or quotes if needed. Eg:
--match-filter "!is_live & like_count>?100
& description~='(?i)\bcats \& dogs\b'"
matches only videos that are not live, have
a like count more than 100 (or the like
field is not available), and also have a
description that contains the phrase "cats
& dogs" (ignoring case)
field is present, use "!field" to check if
the field is not present, and "&" to check
multiple conditions. Use a "\" to escape
"&" or quotes if needed. If used multiple
times, the filter matches if at least one of
the conditions is met. Eg: --match-filter
!is_live --match-filter "like_count>?100 &
description~='(?i)\bcats \& dogs\b'"
matches only videos that are not live OR
those that have a like count more than 100
(or the like field is not available) and
also have a description that contains the
phrase "cats & dogs" (ignoring case)
--no-match-filter Do not use generic video filter (default)
--no-playlist Download only the video, if the URL refers
to a video and a playlist
@@ -605,11 +609,11 @@ You can also fork the project on github and run your fork's [build workflow](.gi
--write-description etc. (default)
--no-write-playlist-metafiles Do not write playlist metadata when using
--write-info-json, --write-description etc.
--clean-infojson Remove some private fields such as
--clean-info-json Remove some private fields such as
filenames from the infojson. Note that it
could still contain some personal
information (default)
--no-clean-infojson Write all fields to the infojson
--no-clean-info-json Write all fields to the infojson
--write-comments Retrieve video comments to be placed in the
infojson. The comments are fetched even
without this option if the extraction is
@@ -737,9 +741,6 @@ You can also fork the project on github and run your fork's [build workflow](.gi
--prefer-insecure Use an unencrypted connection to retrieve
information about the video (Currently
supported only for YouTube)
--user-agent UA Specify a custom user agent
--referer URL Specify a custom referer, use if the video
access is restricted to one domain
--add-header FIELD:VALUE Specify a custom HTTP header and its value,
separated by a colon ":". You can use this
option multiple times
@@ -782,8 +783,8 @@ You can also fork the project on github and run your fork's [build workflow](.gi
containers irrespective of quality
--no-prefer-free-formats Don't give any special preference to free
containers (default)
--check-formats Check that the selected formats are
actually downloadable
--check-formats Make sure formats are selected only from
those that are actually downloadable
--check-all-formats Check all formats for whether they are
actually downloadable
--no-check-formats Do not check that the formats are actually
@@ -840,15 +841,17 @@ You can also fork the project on github and run your fork's [build workflow](.gi
(requires ffmpeg and ffprobe)
--audio-format FORMAT Specify audio format to convert the audio
to when -x is used. Currently supported
formats are: best (default) or one of
best|aac|flac|mp3|m4a|opus|vorbis|wav|alac
--audio-quality QUALITY Specify ffmpeg audio quality, insert a
formats are: best (default) or one of aac,
flac, mp3, m4a, opus, vorbis, wav, alac
--audio-quality QUALITY Specify ffmpeg audio quality to use when
converting the audio with -x. Insert a
value between 0 (best) and 10 (worst) for
VBR or a specific bitrate like 128K
(default 5)
--remux-video FORMAT Remux the video into another container if
necessary (currently supported: mp4|mkv|flv
|webm|mov|avi|mp3|mka|m4a|ogg|opus). If
necessary (currently supported: mp4, mkv,
flv, webm, mov, avi, mka, ogg, aac, flac,
mp3, m4a, opus, vorbis, wav, alac). If
target container does not support the
video/audio codec, remuxing will fail. You
can specify multiple rules; Eg.
@@ -948,10 +951,10 @@ You can also fork the project on github and run your fork's [build workflow](.gi
option can be used multiple times
--no-exec Remove any previously defined --exec
--convert-subs FORMAT Convert the subtitles to another format
(currently supported: srt|vtt|ass|lrc)
(currently supported: srt, vtt, ass, lrc)
(Alias: --convert-subtitles)
--convert-thumbnails FORMAT Convert the thumbnails to another format
(currently supported: jpg|png)
(currently supported: jpg, png, webp)
--split-chapters Split video into multiple files based on
internal chapters. The "chapter:" prefix
can be used with "--paths" and "--output"
@@ -982,15 +985,17 @@ You can also fork the project on github and run your fork's [build workflow](.gi
semicolon ";" delimited list of NAME=VALUE.
The "when" argument determines when the
postprocessor is invoked. It can be one of
"pre_process" (after extraction),
"before_dl" (before video download),
"post_process" (after video download;
default), "after_move" (after moving file
to their final locations), "after_video"
(after downloading and processing all
formats of a video), or "playlist" (end of
playlist). This option can be used multiple
times to add different postprocessors
"pre_process" (after video extraction),
"after_filter" (after video passes filter),
"before_dl" (before each video download),
"post_process" (after each video download;
default), "after_move" (after moving video
file to its final location),
"after_video" (after downloading and
processing all formats of a video), or
"playlist" (at end of playlist). This
option can be used multiple times to add
different postprocessors
## SponsorBlock Options:
Make chapter entries for, or remove various segments (sponsor,
@@ -1153,11 +1158,11 @@ The available fields are:
- `license` (string): License name the video is licensed under
- `creator` (string): The creator of the video
- `timestamp` (numeric): UNIX timestamp of the moment the video became available
- `upload_date` (string): Video upload date (YYYYMMDD)
- `upload_date` (string): Video upload date in UTC (YYYYMMDD)
- `release_timestamp` (numeric): UNIX timestamp of the moment the video was released
- `release_date` (string): The date (YYYYMMDD) when the video was released
- `release_date` (string): The date (YYYYMMDD) when the video was released in UTC
- `modified_timestamp` (numeric): UNIX timestamp of the moment the video was last modified
- `modified_date` (string): The date (YYYYMMDD) when the video was last modified
- `modified_date` (string): The date (YYYYMMDD) when the video was last modified in UTC
- `uploader_id` (string): Nickname or id of the video uploader
- `channel` (string): Full name of the channel the video is uploaded on
- `channel_id` (string): Id of the channel
@@ -1362,7 +1367,7 @@ You can also use special names to select particular edge case formats:
- `bv`, `bestvideo`: Select the best quality **video-only** format. Equivalent to `best*[acodec=none]`
- `bv*`, `bestvideo*`: Select the best quality format that **contains video**. It may also contain audio. Equivalent to `best*[vcodec!=none]`
- `ba`, `bestaudio`: Select the best quality **audio-only** format. Equivalent to `best*[vcodec=none]`
- `ba*`, `bestaudio*`: Select the best quality format that **contains audio**. It may also contain video. Equivalent to `best*[acodec!=none]`
- `ba*`, `bestaudio*`: Select the best quality format that **contains audio**. It may also contain video. Equivalent to `best*[acodec!=none]` ([Do not use!](https://github.com/yt-dlp/yt-dlp/issues/979#issuecomment-919629354))
- `w*`, `worst*`: Select the worst quality format that contains either a video or an audio
- `w`, `worst`: Select the worst quality format that contains both video and audio. Equivalent to `worst*[vcodec!=none][acodec!=none]`
- `wv`, `worstvideo`: Select the worst quality video-only format. Equivalent to `worst*[acodec=none]`
@@ -1370,7 +1375,7 @@ You can also use special names to select particular edge case formats:
- `wa`, `worstaudio`: Select the worst quality audio-only format. Equivalent to `worst*[vcodec=none]`
- `wa*`, `worstaudio*`: Select the worst quality format that contains audio. It may also contain video. Equivalent to `worst*[acodec!=none]`
For example, to download the worst quality video-only format you can use `-f worstvideo`. It is however recommended not to use `worst` and related options. When your format selector is `worst`, the format which is worst in all respects is selected. Most of the time, what you actually want is the video with the smallest filesize instead. So it is generally better to use `-f best -S +size,+br,+res,+fps` instead of `-f worst`. See [sorting formats](#sorting-formats) for more details.
For example, to download the worst quality video-only format you can use `-f worstvideo`. It is however recommended not to use `worst` and related options. When your format selector is `worst`, the format which is worst in all respects is selected. Most of the time, what you actually want is the video with the smallest filesize instead. So it is generally better to use `-S +size` or more rigorously, `-S +size,+br,+res,+fps` instead of `-f worst`. See [sorting formats](#sorting-formats) for more details.
You can select the n'th best format of a type by using `best<type>.<n>`. For example, `best.2` will select the 2nd best combined format. Similarly, `bv*.3` will select the 3rd best format that contains a video stream.
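As a rough sketch of the same selector through the embedding API (using the placeholder URL from the embedding example further below):
```
import yt_dlp

# Select the 3rd-best format that contains a video stream,
# falling back to the best combined format if there is no such format
ydl_opts = {'format': 'bv*.3/b'}
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])
```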
@@ -1399,7 +1404,7 @@ The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `
- `asr`: Audio sampling rate in Hertz
- `fps`: Frame rate
Also filtering work for comparisons `=` (equals), `^=` (starts with), `$=` (ends with), `*=` (contains) and following string meta fields:
Also filtering work for comparisons `=` (equals), `^=` (starts with), `$=` (ends with), `*=` (contains), `~=` (matches regex) and following string meta fields:
- `ext`: File extension
- `acodec`: Name of the audio codec in use
@@ -1409,7 +1414,7 @@ Also filtering work for comparisons `=` (equals), `^=` (starts with), `$=` (ends
- `format_id`: A short description of the format
- `language`: Language code
Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain).
Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain). The comparand of a string comparison needs to be quoted with either double or single quotes if it contains spaces or special characters other than `._-`.
Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by the website. Any other field made available by the extractor can also be used for filtering.
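For instance, a minimal sketch combining the numeric and string filters above (whether these fields are present depends on the extractor):
```
import yt_dlp

# Formats no taller than 720p whose protocol starts with "http";
# "<=?" also matches formats whose height is unknown
ydl_opts = {'format': 'best[height<=?720][protocol^=http]'}
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])
```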
@@ -1552,8 +1557,9 @@ $ yt-dlp -S "proto"
# Download the best video with h264 codec, or the best video if there is no such video
$ yt-dlp -f "(bv*[vcodec^=avc1]+ba) / (bv*+ba/b)"
# Download the best video with either h264 or h265 codec,
# or the best video if there is no such video
$ yt-dlp -f "(bv*[vcodec~='^((he|a)vc|h26[45])']+ba) / (bv*+ba/b)"
# Download the best video with best codec no better than h264,
# or the best video with worst codec if there is no such video
@@ -1598,25 +1604,28 @@ This option also has a few special uses:
* You can download an additional URL based on the metadata of the currently downloaded video. To do this, set the field `additional_urls` to the URL that you want to download. Eg: `--parse-metadata "description:(?P<additional_urls>https?://www\.vimeo\.com/\d+)"` will download the first vimeo video found in the description
* You can use this to change the metadata that is embedded in the media file. To do this, set the value of the corresponding field with a `meta_` prefix. Any value you set to the `meta_description` field, for instance, will be added to the `description` field in the file, allowing you to set a different "description" and "synopsis". To modify the metadata of individual streams, use the `meta<n>_` prefix (Eg: `meta1_language`). Any value set to the `meta_` field will overwrite all default values.
**Note**: Metadata modification happens before format selection, post-extraction and other post-processing operations. Some fields may be added or changed during these steps, overriding your changes.
For reference, these are the fields yt-dlp adds by default to the file metadata:
Metadata fields|From
:---|:---
`title`|`track` or `title`
`date`|`upload_date`
`description`, `synopsis`|`description`
`purl`, `comment`|`webpage_url`
`track`|`track_number`
`artist`|`artist`, `creator`, `uploader` or `uploader_id`
`genre`|`genre`
`album`|`album`
`album_artist`|`album_artist`
`disc`|`disc_number`
`show`|`series`
`season_number`|`season_number`
`episode_id`|`episode` or `episode_id`
`episode_sort`|`episode_number`
`language` of each stream|From the format's `language`
Metadata fields | From
:--------------------------|:------------------------------------------------
`title` | `track` or `title`
`date` | `upload_date`
`description`, `synopsis` | `description`
`purl`, `comment` | `webpage_url`
`track` | `track_number`
`artist` | `artist`, `creator`, `uploader` or `uploader_id`
`genre` | `genre`
`album` | `album`
`album_artist` | `album_artist`
`disc` | `disc_number`
`show` | `series`
`season_number` | `season_number`
`episode_id` | `episode` or `episode_id`
`episode_sort` | `episode_number`
`language` of each stream | the format's `language`
**Note**: The file format may not support some of these fields
@@ -1632,7 +1641,11 @@ $ yt-dlp --parse-metadata "description:Artist - (?P<artist>.+)"
# Set title as "Series name S01E05"
$ yt-dlp --parse-metadata "%(series)s S%(season_number)02dE%(episode_number)02d:%(title)s"
# Set "comment" field in video metadata using description instead of webpage_url
# Prioritize uploader as the "artist" field in video metadata
$ yt-dlp --parse-metadata "%(uploader|)s:%(meta_artist)s" --add-metadata
# Set "comment" field in video metadata using description instead of webpage_url,
# handling multiple lines correctly
$ yt-dlp --parse-metadata "description:(?s)(?P<meta_comment>.+)" --add-metadata
# Remove "formats" field from the infojson by setting it to an empty string
@@ -1645,23 +1658,22 @@ $ yt-dlp --replace-in-metadata "title,uploader" "[ _]" "-"
# EXTRACTOR ARGUMENTS
Some extractors accept additional arguments which can be passed using `--extractor-args KEY:ARGS`. `ARGS` is a `;` (semicolon) separated string of `ARG=VAL1,VAL2`. Eg: `--extractor-args "youtube:player-client=android_agegate,web;include_live_dash" --extractor-args "funimation:version=uncut"`
Some extractors accept additional arguments which can be passed using `--extractor-args KEY:ARGS`. `ARGS` is a `;` (semicolon) separated string of `ARG=VAL1,VAL2`. Eg: `--extractor-args "youtube:player-client=android_embedded,web;include_live_dash" --extractor-args "funimation:version=uncut"`
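When embedding yt-dlp, the same arguments can be passed via the `extractor_args` option as a dict of dicts with list values (a hedged sketch mirroring the CLI example above; the exact values are illustrative):
```
import yt_dlp

# Embedding equivalent of:
#   --extractor-args "youtube:player_client=android_embedded,web"
#   --extractor-args "funimation:version=uncut"
ydl_opts = {
    'extractor_args': {
        'youtube': {'player_client': ['android_embedded', 'web']},
        'funimation': {'version': ['uncut']},
    },
}
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])
```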
The following extractors use this feature:
#### youtube
* `skip`: `hls` or `dash` (or both) to skip download of the respective manifests
* `player_client`: Clients to extract video data from. The main clients are `web`, `android`, `ios`, `mweb`. These also have `_music`, `_embedded`, `_agegate`, and `_creator` variants (Eg: `web_embedded`) (`mweb` has only `_agegate`). By default, `android,web` is used, but the agegate and creator variants are added as required for age-gated videos. Similarly the music variants are added for `music.youtube.com` urls. You can also use `all` to use all the clients, and `default` for the default clients.
* `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and auto-translated subtitles respectively
* `player_client`: Clients to extract video data from. The main clients are `web`, `android` and `ios` with variants `_music`, `_embedded`, `_embedscreen`, `_creator` (Eg: `web_embedded`); and `mweb` and `tv_embedded` (agegate bypass) with no variants. By default, `android,web` is used, but tv_embedded and creator variants are added as required for age-gated videos. Similarly the music variants are added for `music.youtube.com` urls. You can use `all` to use all the clients, and `default` for the default clients.
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause some issues. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) for more details
* `include_live_dash`: Include live dash formats even without `--live-from-start` (These formats don't download properly)
* `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side)
* `max_comments`: Limit the amount of comments to gather. Comma-separated list of integers representing `max-comments,max-parents,max-replies,max-replies-per-thread`. Default is `all,all,all,all`.
* E.g. `all,all,1000,10` will get a maximum of 1000 replies total, with up to 10 replies per thread. `1000,all,100` will get a maximum of 1000 comments, with a maximum of 100 replies total.
* `max_comment_depth` Maximum depth for nested comments. YouTube supports depths 1 or 2 (default)
* **Deprecated**: Set `max-replies` to `0` or `all` in `max_comments` instead (e.g. `max_comments=all,all,0` to get no replies)
* `max_comments`: Limit the amount of comments to gather. Comma-separated list of integers representing `max-comments,max-parents,max-replies,max-replies-per-thread`. Default is `all,all,all,all`
* E.g. `all,all,1000,10` will get a maximum of 1000 replies total, with up to 10 replies per thread. `1000,all,100` will get a maximum of 1000 comments, with a maximum of 100 replies total
#### youtubetab (YouTube playlists, channels, feeds, etc.)
* `skip`: One or more of `webpage` (skip initial webpage download), `authcheck` (allow the download of playlists requiring authentication when no initial webpage is downloaded. This may cause unwanted behavior, see [#1122](https://github.com/yt-dlp/yt-dlp/pull/1122) for more details)
* `approximate_date`: Extract approximate `upload_date` in flat-playlist. This may cause date-based filters to be slightly off
#### funimation
* `language`: Languages to extract. Eg: `funimation:language=english,japanese`
@@ -1671,7 +1683,7 @@ The following extractors use this feature:
* `language`: Languages to extract. Eg: `crunchyroll:language=jaJp`
* `hardsub`: Which hard-sub versions to extract. Eg: `crunchyroll:hardsub=None,enUS`
#### crunchyroll:beta
#### crunchyrollbeta
* `format`: Which stream type(s) to extract. Default is `adaptive_hls` Eg: `crunchyrollbeta:format=vo_adaptive_hls`
* Potentially useful values include `adaptive_hls`, `adaptive_dash`, `vo_adaptive_hls`, `vo_adaptive_dash`, `download_hls`, `trailer_hls`, `trailer_dash`
* `hardsub`: Preference order for which hardsub versions to extract. Default is `None` (no hardsubs). Eg: `crunchyrollbeta:hardsub=en-US,None`
@@ -1679,6 +1691,9 @@ The following extractors use this feature:
#### vikichannel
* `video_types`: Types of videos to download - one or more of `episodes`, `movies`, `clips`, `trailers`
#### niconico
* `segment_duration`: Segment duration in milliseconds for HLS-DMC formats. Use it at your own risk since this feature **may result in your account termination.**
#### youtubewebarchive
* `check_all`: Try to check more at the cost of more requests. One or more of `thumbnails`, `captures`
@@ -1694,6 +1709,10 @@ The following extractors use this feature:
* `app_version`: App version to call mobile APIs with - should be set along with `manifest_app_version`. (e.g. `20.2.1`)
* `manifest_app_version`: Numeric app version to call mobile APIs with. (e.g. `221`)
#### rokfinchannel
* `tab`: Which tab to download. One of `new`, `top`, `videos`, `podcasts`, `streams`, `stacks`. (E.g. `rokfinchannel:tab=streams`)
NOTE: These options may be changed/removed in the future without concern for backward compatibility
<!-- MANPAGE: MOVE "INSTALLATION" SECTION HERE -->
@@ -1729,7 +1748,7 @@ with YoutubeDL(ydl_opts) as ydl:
ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])
```
Most likely, you'll want to use various options. For a list of options available, have a look at [`yt_dlp/YoutubeDL.py`](yt_dlp/YoutubeDL.py#L197).
Here's a more complete example demonstrating various functionality:
@@ -1810,12 +1829,11 @@ ydl_opts = {
}],
'logger': MyLogger(),
'progress_hooks': [my_hook],
# Add custom headers
'http_headers': {'Referer': 'https://www.google.com'}
}
# Add custom headers
yt_dlp.utils.std_headers.update({'Referer': 'https://www.google.com'})
# See the public functions in yt_dlp.YoutubeDL for other available functions.
# Eg: "ydl.download", "ydl.download_with_info_file"
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
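As of this change, custom headers are passed per-instance through `ydl_opts` instead of mutating the global `yt_dlp.utils.std_headers`; a minimal sketch of the new approach:

```python
import yt_dlp

ydl_opts = {
    # Merged over std_headers for every request made by this instance
    'http_headers': {'Referer': 'https://www.google.com'},
}
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])
```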
@@ -1858,6 +1876,8 @@ While these options are redundant, they are still expected to be used due to the
--reject-title REGEX --match-filter "title !~= (?i)REGEX"
--min-views COUNT --match-filter "view_count >=? COUNT"
--max-views COUNT --match-filter "view_count <=? COUNT"
--user-agent UA --add-header "User-Agent:UA"
--referer URL --add-header "Referer:URL"
#### Not recommended
@@ -1895,11 +1915,13 @@ These options are not intended to be used by the end-user
These are aliases that are no longer documented for various reasons
--avconv-location --ffmpeg-location
--clean-infojson --clean-info-json
--cn-verification-proxy URL --geo-verification-proxy URL
--dump-headers --print-traffic
--dump-intermediate-pages --dump-pages
--force-write-download-archive --force-write-archive
--load-info --load-info-json
--no-clean-infojson --no-clean-info-json
--no-split-tracks --no-split-chapters
--no-write-srt --no-write-subs
--prefer-unsecure --prefer-insecure

View File

@@ -24,10 +24,9 @@ def main():
def gen_ies_md(ies):
for ie in ies:
ie_md = '**{0}**'.format(ie.IE_NAME)
ie_desc = getattr(ie, 'IE_DESC', None)
if ie_desc is False:
if ie.IE_DESC is False:
continue
if ie_desc is not None:
if ie.IE_DESC is not None:
ie_md += ': {0}'.format(ie.IE_DESC)
search_key = getattr(ie, 'SEARCH_KEY', None)
if search_key is not None:

View File

@@ -75,21 +75,21 @@ def filter_options(readme):
section = re.search(r'(?sm)^# USAGE AND OPTIONS\n.+?(?=^# )', readme).group(0)
options = '# OPTIONS\n'
for line in section.split('\n')[1:]:
if line.lstrip().startswith('-'):
split = re.split(r'\s{2,}', line.lstrip())
# Description string may start with `-` as well. If there is
# only one piece then it's a description, not an option.
if len(split) > 1:
option, description = split
split_option = option.split(' ')
if not split_option[-1].startswith('-'): # metavar
option = ' '.join(split_option[:-1] + [f'*{split_option[-1]}*'])
mobj = re.fullmatch(r'''(?x)
\s{4}(?P<opt>-(?:,\s|[^\s])+)
(?:\s(?P<meta>(?:[^\s]|\s(?!\s))+))?
(\s{2,}(?P<desc>.+))?
''', line)
if not mobj:
options += f'{line.lstrip()}\n'
continue
option, metavar, description = mobj.group('opt', 'meta', 'desc')
# Pandoc's definition_lists. See http://pandoc.org/README.html
options += f'\n{option}\n: {description}\n'
option = f'{option} *{metavar}*' if metavar else option
description = f'{description}\n' if description else ''
options += f'\n{option}\n: {description}'
continue
options += line.lstrip() + '\n'
return readme.replace(section, options, 1)
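To see how the new `fullmatch`-based parser splits an option line, here is a small standalone sketch (the sample line is made up for illustration):

```python
import re

OPTION_RE = re.compile(r'''(?x)
    \s{4}(?P<opt>-(?:,\s|[^\s])+)          # option flag(s), e.g. "-o, --output"
    (?:\s(?P<meta>(?:[^\s]|\s(?!\s))+))?   # optional metavar; single spaces allowed
    (\s{2,}(?P<desc>.+))?                  # description, separated by 2+ spaces
''')

line = '    -o, --output TEMPLATE            Output filename template'
mobj = OPTION_RE.fullmatch(line)
option, metavar, description = mobj.group('opt', 'meta', 'desc')
print(option)       # -o, --output
print(metavar)      # TEMPLATE
print(description)  # Output filename template
```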

docs/.gitignore
View File

@@ -1 +0,0 @@
_build/

View File

@@ -1,5 +0,0 @@
---
orphan: true
---
```{include} ../Changelog.md
```

View File

@@ -1,5 +0,0 @@
---
orphan: true
---
```{include} ../Collaborators.md
```

View File

@@ -1,5 +0,0 @@
---
orphan: true
---
```{include} ../Contributing.md
```

View File

@@ -1,6 +0,0 @@
---
orphan: true
---
# LICENSE
```{include} ../LICENSE
```

View File

@@ -1,177 +0,0 @@
# Makefile for Sphinx documentation
#
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
PAPER =
BUILDDIR = _build
# User-friendly check for sphinx-build
ifeq ($(shell which $(SPHINXBUILD) >/dev/null 2>&1; echo $$?), 1)
$(error The '$(SPHINXBUILD)' command was not found. Make sure you have Sphinx installed, then set the SPHINXBUILD environment variable to point to the full path of the '$(SPHINXBUILD)' executable. Alternatively you can add the directory with the executable to your PATH. If you don't have Sphinx installed, grab it from http://sphinx-doc.org/)
endif
# Internal variables.
PAPEROPT_a4 = -D latex_paper_size=a4
PAPEROPT_letter = -D latex_paper_size=letter
ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
# the i18n builder cannot share the environment and doctrees with the others
I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
.PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest gettext
help:
@echo "Please use \`make <target>' where <target> is one of"
@echo " html to make standalone HTML files"
@echo " dirhtml to make HTML files named index.html in directories"
@echo " singlehtml to make a single large HTML file"
@echo " pickle to make pickle files"
@echo " json to make JSON files"
@echo " htmlhelp to make HTML files and a HTML help project"
@echo " qthelp to make HTML files and a qthelp project"
@echo " devhelp to make HTML files and a Devhelp project"
@echo " epub to make an epub"
@echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
@echo " latexpdf to make LaTeX files and run them through pdflatex"
@echo " latexpdfja to make LaTeX files and run them through platex/dvipdfmx"
@echo " text to make text files"
@echo " man to make manual pages"
@echo " texinfo to make Texinfo files"
@echo " info to make Texinfo files and run them through makeinfo"
@echo " gettext to make PO message catalogs"
@echo " changes to make an overview of all changed/added/deprecated items"
@echo " xml to make Docutils-native XML files"
@echo " pseudoxml to make pseudoxml-XML files for display purposes"
@echo " linkcheck to check all external links for integrity"
@echo " doctest to run all doctests embedded in the documentation (if enabled)"
clean:
rm -rf $(BUILDDIR)/*
html:
$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
@echo
@echo "Build finished. The HTML pages are in $(BUILDDIR)/html."
dirhtml:
$(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml
@echo
@echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml."
singlehtml:
$(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml
@echo
@echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml."
pickle:
$(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle
@echo
@echo "Build finished; now you can process the pickle files."
json:
$(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json
@echo
@echo "Build finished; now you can process the JSON files."
htmlhelp:
$(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp
@echo
@echo "Build finished; now you can run HTML Help Workshop with the" \
".hhp project file in $(BUILDDIR)/htmlhelp."
qthelp:
$(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp
@echo
@echo "Build finished; now you can run "qcollectiongenerator" with the" \
".qhcp project file in $(BUILDDIR)/qthelp, like this:"
@echo "# qcollectiongenerator $(BUILDDIR)/qthelp/yt-dlp.qhcp"
@echo "To view the help file:"
@echo "# assistant -collectionFile $(BUILDDIR)/qthelp/yt-dlp.qhc"
devhelp:
$(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp
@echo
@echo "Build finished."
@echo "To view the help file:"
@echo "# mkdir -p $$HOME/.local/share/devhelp/yt-dlp"
@echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/yt-dlp"
@echo "# devhelp"
epub:
$(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub
@echo
@echo "Build finished. The epub file is in $(BUILDDIR)/epub."
latex:
$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
@echo
@echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex."
@echo "Run \`make' in that directory to run these through (pdf)latex" \
"(use \`make latexpdf' here to do that automatically)."
latexpdf:
$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
@echo "Running LaTeX files through pdflatex..."
$(MAKE) -C $(BUILDDIR)/latex all-pdf
@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
latexpdfja:
$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
@echo "Running LaTeX files through platex and dvipdfmx..."
$(MAKE) -C $(BUILDDIR)/latex all-pdf-ja
@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
text:
$(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text
@echo
@echo "Build finished. The text files are in $(BUILDDIR)/text."
man:
$(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man
@echo
@echo "Build finished. The manual pages are in $(BUILDDIR)/man."
texinfo:
$(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
@echo
@echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo."
@echo "Run \`make' in that directory to run these through makeinfo" \
"(use \`make info' here to do that automatically)."
info:
$(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
@echo "Running Texinfo files through makeinfo..."
make -C $(BUILDDIR)/texinfo info
@echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo."
gettext:
$(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale
@echo
@echo "Build finished. The message catalogs are in $(BUILDDIR)/locale."
changes:
$(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes
@echo
@echo "The overview file is in $(BUILDDIR)/changes."
linkcheck:
$(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck
@echo
@echo "Link check complete; look for any errors in the above output " \
"or in $(BUILDDIR)/linkcheck/output.txt."
doctest:
$(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest
@echo "Testing of doctests in the sources finished, look at the " \
"results in $(BUILDDIR)/doctest/output.txt."
xml:
$(SPHINXBUILD) -b xml $(ALLSPHINXOPTS) $(BUILDDIR)/xml
@echo
@echo "Build finished. The XML files are in $(BUILDDIR)/xml."
pseudoxml:
$(SPHINXBUILD) -b pseudoxml $(ALLSPHINXOPTS) $(BUILDDIR)/pseudoxml
@echo
@echo "Build finished. The pseudo-XML files are in $(BUILDDIR)/pseudoxml."

View File

@@ -1,2 +0,0 @@
```{include} ../README.md
```

View File

@@ -1,68 +0,0 @@
# coding: utf-8
#
# yt-dlp documentation build configuration file
import sys
import os
# Allow importing yt-dlp
sys.path.insert(0, os.path.abspath('..'))
# -- General configuration ------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
'myst_parser',
]
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The master toctree document.
master_doc = 'README'
# General information about the project.
project = u'yt-dlp'
author = u'yt-dlp'
copyright = u'UNLICENSE'
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
from yt_dlp.version import __version__
version = __version__
# The full version, including alpha/beta/rc tags.
release = version
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
exclude_patterns = ['_build']
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
# -- Options for HTML output ----------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
html_theme = 'default'
# Disable highlights
highlight_language = 'none'
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
# html_static_path = ['_static']
# Enable heading anchors
myst_heading_anchors = 4
# Suppress heading warnings
suppress_warnings = [
'myst.header',
]

View File

@@ -1 +0,0 @@
myst-parser

View File

@@ -1,5 +0,0 @@
---
orphan: true
---
```{include} ../supportedsites.md
```

View File

@@ -1,6 +0,0 @@
---
orphan: true
---
# ytdlp_plugins
See [https://github.com/yt-dlp/yt-dlp/tree/master/ytdlp_plugins](https://github.com/yt-dlp/yt-dlp/tree/master/ytdlp_plugins).

View File

@@ -74,7 +74,7 @@ def version_to_list(version):
def dependency_options():
dependencies = [pycryptodome_module(), 'mutagen'] + collect_submodules('websockets')
dependencies = [pycryptodome_module(), 'mutagen', 'brotli', 'certifi'] + collect_submodules('websockets')
excluded_modules = ['test', 'ytdlp_plugins', 'youtube-dl', 'youtube-dlc']
yield from (f'--hidden-import={module}' for module in dependencies)
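A rough sketch of what this generator now yields (assuming, for illustration, that `pycryptodome_module()` resolves to `Cryptodome`):

```python
from PyInstaller.utils.hooks import collect_submodules

# Hypothetical standalone reproduction of dependency_options()
dependencies = ['Cryptodome', 'mutagen', 'brotli', 'certifi'] + collect_submodules('websockets')
print('\n'.join(f'--hidden-import={module}' for module in dependencies))
# --hidden-import=Cryptodome
# --hidden-import=mutagen
# ...one flag per dependency, plus one per websockets submodule
```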

View File

@@ -1,3 +1,6 @@
mutagen
pycryptodomex
websockets
brotli; platform_python_implementation=='CPython'
brotlicffi; platform_python_implementation!='CPython'
certifi
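The `;` clauses are standard environment markers (PEP 508): pip installs `brotli` only on CPython and `brotlicffi` on other implementations such as PyPy. A quick way to check which branch applies locally:

```python
import platform

# 'CPython' -> brotli is installed; anything else -> brotlicffi
print(platform.python_implementation())
```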

View File

@@ -21,9 +21,9 @@ DESCRIPTION = 'A youtube-dl fork with additional features and patches'
LONG_DESCRIPTION = '\n\n'.join((
'Official repository: <https://github.com/yt-dlp/yt-dlp>',
'**PS**: Some links in this document will not work since this is a copy of the README.md from Github',
open('README.md', 'r', encoding='utf-8').read()))
open('README.md', encoding='utf-8').read()))
REQUIREMENTS = ['mutagen', 'pycryptodomex', 'websockets']
REQUIREMENTS = open('requirements.txt', encoding='utf-8').read().splitlines()
if sys.argv[1:2] == ['py2exe']:

View File

@@ -3,7 +3,6 @@
- **17live:clip**
- **1tv**: Первый канал
- **20min**
- **220.ro**
- **23video**
- **247sports**
- **24video**
@@ -11,7 +10,6 @@
- **3sat**
- **4tube**
- **56.com**
- **5min**
- **6play**
- **7plus**
- **8tracks**
@@ -26,6 +24,8 @@
- **abcnews:video**
- **abcotvs**: ABC Owned Television Stations
- **abcotvs:clips**
- **AbemaTV**
- **AbemaTVTitle**
- **AcademicEarth:Course**
- **acast**
- **acast:channel**
@@ -42,11 +42,14 @@
- **aenetworks:show**
- **afreecatv**: afreecatv.com
- **afreecatv:live**: afreecatv.com
- **afreecatv:user**
- **AirMozilla**
- **AliExpressLive**
- **AlJazeera**
- **Allocine**
- **AlphaPorno**
- **Alsace20TV**
- **Alsace20TVEmbed**
- **Alura**
- **AluraCourse**
- **Amara**
@@ -60,6 +63,9 @@
- **AnimeLab**
- **AnimeLabShows**
- **AnimeOnDemand**
- **ant1newsgr:article**: ant1news.gr articles
- **ant1newsgr:embed**: ant1news.gr embedded videos
- **ant1newsgr:watch**: ant1news.gr videos
- **Anvato**
- **aol.com**: Yahoo screen and movies
- **APA**
@@ -77,6 +83,7 @@
- **Arkena**
- **arte.sky.it**
- **ArteTV**
- **ArteTVCategory**
- **ArteTVEmbed**
- **ArteTVPlaylist**
- **AsianCrush**
@@ -98,11 +105,13 @@
- **awaan:video**
- **AZMedien**: AZ Medien videos
- **BaiduVideo**: 百度视频
- **BanBye**
- **BanByeChannel**
- **bandaichannel**
- **Bandcamp**
- **Bandcamp:album**
- **Bandcamp:user**
- **Bandcamp:weekly**
- **BandcampMusic**
- **bangumi.bilibili.com**: BiliBili番剧
- **BannedVideo**
- **bbc**: BBC
@@ -124,6 +133,7 @@
- **bfmtv:live**
- **BibelTV**
- **Bigflix**
- **Bigo**
- **Bild**: Bild.de
- **BiliBili**
- **Bilibili category extractor**
@@ -165,6 +175,7 @@
- **BYUtv**
- **CableAV**
- **Callin**
- **Caltrans**
- **CAM4**
- **Camdemy**
- **CamdemyFolder**
@@ -233,8 +244,11 @@
- **Coub**
- **CozyTV**
- **cp24**
- **cpac**
- **cpac:playlist**
- **Cracked**
- **Crackle**
- **Craftsy**
- **CrooksAndLiars**
- **CrowdBunker**
- **CrowdBunkerChannel**
@@ -243,6 +257,7 @@
- **crunchyroll:playlist**
- **crunchyroll:playlist:beta**
- **CSpan**: C-SPAN
- **CSpanCongress**
- **CtsNews**: 華視新聞
- **CTV**
- **CTVNews**
@@ -252,6 +267,8 @@
- **curiositystream:collections**
- **curiositystream:series**
- **CWTV**
- **Cybrary**
- **CybraryCourse**
- **Daftsex**
- **DagelijkseKost**: dagelijksekost.een.be
- **DailyMail**
@@ -264,6 +281,7 @@
- **daum.net:clip**
- **daum.net:playlist**
- **daum.net:user**
- **daystar:clip**
- **DBTV**
- **DctpTv**
- **DeezerAlbum**
@@ -355,6 +373,7 @@
- **faz.net**
- **fc2**
- **fc2:embed**
- **fc2:live**
- **Fczenit**
- **Filmmodu**
- **filmon**
@@ -374,6 +393,7 @@
- **foxnews**: Fox News and Fox Business Video
- **foxnews:article**
- **FoxSports**
- **fptplay**: fptplay.vn
- **FranceCulture**
- **FranceInter**
- **FranceTV**
@@ -381,7 +401,6 @@
- **FranceTVSite**
- **Freesound**
- **freespeech.org**
- **FreshLive**
- **FrontendMasters**
- **FrontendMastersCourse**
- **FrontendMastersLesson**
@@ -413,6 +432,7 @@
- **gem.cbc.ca:playlist**
- **generic**: Generic downloader that works on some sites
- **Gettr**
- **GettrStreaming**
- **Gfycat**
- **GiantBomb**
- **Giga**
@@ -454,7 +474,6 @@
- **hitbox:live**
- **HitRecord**
- **hketv**: 香港教育局教育電視 (HKETV) Educational Television, Hong Kong Educational Bureau
- **HornBunny**
- **HotNewHipHop**
- **hotstar**
- **hotstar:playlist**
@@ -471,6 +490,7 @@
- **Hungama**
- **HungamaAlbumPlaylist**
- **HungamaSong**
- **huya:live**: huya.com
- **Hypem**
- **ign.com**
- **IGNArticle**
@@ -499,7 +519,8 @@
- **iq.com**: International version of iQiyi
- **iq.com:album**
- **iqiyi**: 爱奇艺
- **Ir90Tv**
- **ITProTV**
- **ITProTVCourse**
- **ITTF**
- **ITV**
- **ITVBTCC**
@@ -508,6 +529,8 @@
- **ivideon**: Ivideon TV
- **Iwara**
- **Izlesene**
- **Jable**
- **JablePlaylist**
- **Jamendo**
- **JamendoAlbum**
- **JeuxVideo**
@@ -516,7 +539,6 @@
- **JWPlatform**
- **Kakao**
- **Kaltura**
- **Kankan**
- **Karaoketv**
- **KarriereVideos**
- **Katsomo**
@@ -544,6 +566,9 @@
- **la7.it:podcast**
- **laola1tv**
- **laola1tv:embed**
- **LastFM**
- **LastFMPlaylist**
- **LastFMUser**
- **lbry**
- **lbry:channel**
- **LCI**
@@ -592,6 +617,7 @@
- **MallTV**
- **mangomolo:live**
- **mangomolo:video**
- **MangoTV**: 芒果TV
- **ManotoTV**: Manoto TV (Episode)
- **ManotoTVLive**: Manoto TV (Live)
- **ManotoTVShow**: Manoto TV (Show)
@@ -624,12 +650,12 @@
- **Metacritic**
- **mewatch**
- **Mgoon**
- **MGTV**: 芒果TV
- **MiaoPai**
- **microsoftstream**: Microsoft Stream
- **mildom**: Record ongoing live by specific user in Mildom
- **mildom:clip**: Clip in Mildom
- **mildom:user:vod**: Download all VODs from specific user in Mildom
- **mildom:vod**: Download a VOD in Mildom
- **mildom:vod**: VOD in Mildom
- **minds**
- **minds:channel**
- **minds:group**
@@ -659,6 +685,7 @@
- **Motorsport**: motorsport.com
- **MovieClips**
- **MovieFap**
- **Moviepilot**
- **Moviezine**
- **MovingImage**
- **MSN**
@@ -672,6 +699,8 @@
- **mtvservices:embedded**
- **MTVUutisetArticle**
- **MuenchenTV**: münchen.tv
- **Murrtube**
- **MurrtubeUser**: Murrtube user profile
- **MuseScore**
- **MusicdexAlbum**
- **MusicdexArtist**
@@ -691,7 +720,6 @@
- **MyVideoGe**
- **MyVidster**
- **MyviEmbed**
- **MyVisionTV**
- **n-tv.de**
- **N1Info:article**
- **N1InfoAsset**
@@ -740,9 +768,13 @@
- **NextTV**: 壹電視
- **Nexx**
- **NexxEmbed**
- **NFB**
- **NFHSNetwork**
- **nfl.com** (Currently broken)
- **nfl.com:article** (Currently broken)
- **NhkForSchoolBangumi**
- **NhkForSchoolProgramList**
- **NhkForSchoolSubject**: Portal page for each school subject, like Japanese (kokugo, 国語) or math (sansuu/suugaku or 算数・数学)
- **NhkVod**
- **NhkVodProgram**
- **nhl.com**
@@ -752,7 +784,10 @@
- **nickelodeonru**
- **nicknight**
- **niconico**: ニコニコ動画
- **NiconicoPlaylist**
- **niconico:history**: NicoNico user history. Requires cookies.
- **niconico:playlist**
- **niconico:series**
- **niconico:tag**: NicoNico video tag URLs
- **NiconicoUser**
- **nicovideo:search**: Nico video search; "nicosearch:" prefix
- **nicovideo:search:date**: Nico video search, newest first; "nicosearchdate:" prefix
@@ -842,6 +877,9 @@
- **PalcoMP3:song**
- **PalcoMP3:video**
- **pandora.tv**: 판도라TV
- **Panopto**
- **PanoptoList**
- **PanoptoPlaylist**
- **ParamountNetwork**
- **ParamountPlus**
- **ParamountPlusSeries**
@@ -851,6 +889,7 @@
- **PatreonUser**
- **pbs**: Public Broadcasting Service (PBS) and member stations: PBS: Public Broadcasting Service, APT - Alabama Public Television (WBIQ), GPB/Georgia Public Broadcasting (WGTV), Mississippi Public Broadcasting (WMPN), Nashville Public Television (WNPT), WFSU-TV (WFSU), WSRE (WSRE), WTCI (WTCI), WPBA/Channel 30 (WPBA), Alaska Public Media (KAKM), Arizona PBS (KAET), KNME-TV/Channel 5 (KNME), Vegas PBS (KLVX), AETN/ARKANSAS ETV NETWORK (KETS), KET (WKLE), WKNO/Channel 10 (WKNO), LPB/LOUISIANA PUBLIC BROADCASTING (WLPB), OETA (KETA), Ozarks Public Television (KOZK), WSIU Public Broadcasting (WSIU), KEET TV (KEET), KIXE/Channel 9 (KIXE), KPBS San Diego (KPBS), KQED (KQED), KVIE Public Television (KVIE), PBS SoCal/KOCE (KOCE), ValleyPBS (KVPT), CONNECTICUT PUBLIC TELEVISION (WEDH), KNPB Channel 5 (KNPB), SOPTV (KSYS), Rocky Mountain PBS (KRMA), KENW-TV3 (KENW), KUED Channel 7 (KUED), Wyoming PBS (KCWC), Colorado Public Television / KBDI 12 (KBDI), KBYU-TV (KBYU), Thirteen/WNET New York (WNET), WGBH/Channel 2 (WGBH), WGBY (WGBY), NJTV Public Media NJ (WNJT), WLIW21 (WLIW), mpt/Maryland Public Television (WMPB), WETA Television and Radio (WETA), WHYY (WHYY), PBS 39 (WLVT), WVPT - Your Source for PBS and More! (WVPT), Howard University Television (WHUT), WEDU PBS (WEDU), WGCU Public Media (WGCU), WPBT2 (WPBT), WUCF TV (WUCF), WUFT/Channel 5 (WUFT), WXEL/Channel 42 (WXEL), WLRN/Channel 17 (WLRN), WUSF Public Broadcasting (WUSF), ETV (WRLK), UNC-TV (WUNC), PBS Hawaii - Oceanic Cable Channel 10 (KHET), Idaho Public Television (KAID), KSPS (KSPS), OPB (KOPB), KWSU/Channel 10 & KTNW/Channel 31 (KWSU), WILL-TV (WILL), Network Knowledge - WSEC/Springfield (WSEC), WTTW11 (WTTW), Iowa Public Television/IPTV (KDIN), Nine Network (KETC), PBS39 Fort Wayne (WFWA), WFYI Indianapolis (WFYI), Milwaukee Public Television (WMVS), WNIN (WNIN), WNIT Public Television (WNIT), WPT (WPNE), WVUT/Channel 22 (WVUT), WEIU/Channel 51 (WEIU), WQPT-TV (WQPT), WYCC PBS Chicago (WYCC), WIPB-TV (WIPB), WTIU (WTIU), CET (WCET), ThinkTVNetwork (WPTD), WBGU-TV (WBGU), WGVU TV (WGVU), NET1 (KUON), Pioneer Public Television (KWCM), SDPB Television (KUSD), TPT (KTCA), KSMQ (KSMQ), KPTS/Channel 8 (KPTS), KTWU/Channel 11 (KTWU), East Tennessee PBS (WSJK), WCTE-TV (WCTE), WLJT, Channel 11 (WLJT), WOSU TV (WOSU), WOUB/WOUC (WOUB), WVPB (WVPB), WKYU-PBS (WKYU), KERA 13 (KERA), MPBN (WCBB), Mountain Lake PBS (WCFE), NHPTV (WENH), Vermont PBS (WETK), witf (WITF), WQED Multimedia (WQED), WMHT Educational Telecommunications (WMHT), Q-TV (WDCQ), WTVS Detroit Public TV (WTVS), CMU Public Television (WCMU), WKAR-TV (WKAR), WNMU-TV Public TV 13 (WNMU), WDSE - WRPT (WDSE), WGTE TV (WGTE), Lakeland Public Television (KAWE), KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS), MontanaPBS (KUSM), KRWG/Channel 22 (KRWG), KACV (KACV), KCOS/Channel 13 (KCOS), WCNY/Channel 24 (WCNY), WNED (WNED), WPBS (WPBS), WSKG Public TV (WSKG), WXXI (WXXI), WPSU (WPSU), WVIA Public Media Studios (WVIA), WTVI (WTVI), Western Reserve PBS (WNEO), WVIZ/PBS ideastream (WVIZ), KCTS 9 (KCTS), Basin PBS (KPBT), KUHT / Channel 8 (KUHT), KLRN (KLRN), KLRU (KLRU), WTJX Channel 12 (WTJX), WCVE PBS (WCVE), KBTC Public Television (KBTC)
- **PearVideo**
- **PeekVids**
- **peer.tv**
- **PeerTube**
- **PeerTube:Playlist**
@@ -863,6 +902,7 @@
- **PhilharmonieDeParis**: Philharmonie de Paris
- **phoenix.de**
- **Photobucket**
- **Piapro**
- **Picarto**
- **PicartoVod**
- **Piksel**
@@ -882,12 +922,14 @@
- **PlaysTV**
- **Playtvak**: Playtvak.cz, iDNES.cz and Lidovky.cz
- **Playvid**
- **PlayVids**
- **Playwire**
- **pluralsight**
- **pluralsight:course**
- **PlutoTV**
- **podomatic**
- **Pokemon**
- **PokemonSoundLibrary**
- **PokemonWatch**
- **PokerGo**
- **PokerGoCollection**
@@ -933,8 +975,6 @@
- **qqmusic:toplist**: QQ音乐 - 排行榜
- **QuantumTV**
- **Qub**
- **Quickline**
- **QuicklineLive**
- **R7**
- **R7Article**
- **Radiko**
@@ -986,10 +1026,12 @@
- **RICE**
- **RMCDecouverte**
- **RockstarGames**
- **Rokfin**
- **rokfin:channel**
- **rokfin:stack**
- **RoosterTeeth**
- **RoosterTeethSeries**
- **RottenTomatoes**
- **Roxwel**
- **Rozhlas**
- **RTBF**
- **RTDocumentry**
@@ -1026,6 +1068,7 @@
- **RUTV**: RUTV.RU
- **Ruutu**
- **Ruv**
- **ruv.is:spila**
- **safari**: safaribooksonline.com online video
- **safari:api**
- **safari:course**: safaribooksonline.com online courses
@@ -1165,6 +1208,7 @@
- **TeleBruxelles**
- **Telecinco**: telecinco.es, cuatro.com and mediaset.es
- **Telegraaf**
- **telegram:embed**
- **TeleMB**
- **Telemundo**
- **TeleQuebec**
@@ -1181,7 +1225,6 @@
- **TheIntercept**
- **ThePlatform**
- **ThePlatformFeed**
- **TheScene**
- **TheStar**
- **TheSun**
- **ThetaStream**
@@ -1327,6 +1370,8 @@
- **video.google:search**: Google Video search; "gvsearch:" prefix
- **video.sky.it**
- **video.sky.it:live**
- **VideocampusSachsen**
- **VideocampusSachsenEmbed**
- **VideoDetective**
- **videofy.me**
- **videomore**
@@ -1369,6 +1414,7 @@
- **vlive**
- **vlive:channel**
- **vlive:post**
- **vm.tiktok**
- **Vodlocker**
- **VODPl**
- **VODPlatform**
@@ -1388,7 +1434,6 @@
- **VShare**
- **VTM**
- **VTXTV**
- **vube**: Vube.com
- **VuClip**
- **Vupload**
- **VVVVID**
@@ -1398,13 +1443,16 @@
- **Wakanim**
- **Walla**
- **WalyTV**
- **wasdtv:clip**
- **wasdtv:record**
- **wasdtv:stream**
- **washingtonpost**
- **washingtonpost:article**
- **wat.tv**
- **WatchBox**
- **WatchIndianPorn**: Watch Indian Porn
- **WDR**
- **wdr:mobile**
- **wdr:mobile** (Currently broken)
- **WDRElefant**
- **WDRPage**
- **web.archive:youtube**: web.archive.org saved youtube videos, "ytarchive:" prefix
@@ -1439,6 +1487,7 @@
- **xiami:song**: 虾米音乐
- **ximalaya**: 喜马拉雅FM
- **ximalaya:album**: 喜马拉雅FM 专辑
- **xinpianchang**: xinpianchang.com
- **XMinus**
- **XNXX**
- **Xstream**
@@ -1490,6 +1539,8 @@
- **Zapiks**
- **Zattoo**
- **ZattooLive**
- **ZattooMovies**
- **ZattooRecordings**
- **ZDF**
- **ZDFChannel**
- **Zee5**
@@ -1497,7 +1548,7 @@
- **ZenYandex**
- **ZenYandexChannel**
- **Zhihu**
- **zingmp3**: mp3.zing.vn
- **zingmp3**: zingmp3.vn
- **zingmp3:album**
- **zoom**
- **Zype**

View File

@@ -196,15 +196,7 @@ def expect_dict(self, got_dict, expected_dict):
def sanitize_got_info_dict(got_dict):
IGNORED_FIELDS = (
# Format keys
'url', 'manifest_url', 'format', 'format_id', 'format_note', 'width', 'height', 'resolution',
'dynamic_range', 'tbr', 'abr', 'acodec', 'asr', 'vbr', 'fps', 'vcodec', 'container', 'filesize',
'filesize_approx', 'player_url', 'protocol', 'fragment_base_url', 'fragments', 'preference',
'language', 'language_preference', 'quality', 'source_preference', 'http_headers',
'stretched_ratio', 'no_resume', 'has_drm', 'downloader_options',
# RTMP formats
'page_url', 'app', 'play_path', 'tc_url', 'flash_version', 'rtmp_live', 'rtmp_conn', 'rtmp_protocol', 'rtmp_real_time',
*YoutubeDL._format_fields,
# Lists
'formats', 'thumbnails', 'subtitles', 'automatic_captions', 'comments', 'entries',

View File

@@ -30,9 +30,7 @@ class YDL(FakeYDL):
self.msgs = []
def process_info(self, info_dict):
info_dict = info_dict.copy()
info_dict.pop('__original_infodict', None)
self.downloaded_info_dicts.append(info_dict)
self.downloaded_info_dicts.append(info_dict.copy())
def to_screen(self, msg):
self.msgs.append(msg)
@@ -820,6 +818,8 @@ class TestYoutubeDL(unittest.TestCase):
test('%(id&foo)s.bar', 'foo.bar')
test('%(title&foo)s.bar', 'NA.bar')
test('%(title&foo|baz)s.bar', 'baz.bar')
test('%(x,id&foo|baz)s.bar', 'foo.bar')
test('%(x,title&foo|baz)s.bar', 'baz.bar')
# Laziness
def gen():
@@ -898,20 +898,6 @@ class TestYoutubeDL(unittest.TestCase):
os.unlink(filename)
def test_match_filter(self):
class FilterYDL(YDL):
def __init__(self, *args, **kwargs):
super(FilterYDL, self).__init__(*args, **kwargs)
self.params['simulate'] = True
def process_info(self, info_dict):
super(YDL, self).process_info(info_dict)
def _match_entry(self, info_dict, incomplete=False):
res = super(FilterYDL, self)._match_entry(info_dict, incomplete)
if res is None:
self.downloaded_info_dicts.append(info_dict.copy())
return res
first = {
'id': '1',
'url': TEST_URL,
@@ -939,7 +925,7 @@ class TestYoutubeDL(unittest.TestCase):
videos = [first, second]
def get_videos(filter_=None):
ydl = FilterYDL({'match_filter': filter_})
ydl = YDL({'match_filter': filter_, 'simulate': True})
for v in videos:
ydl.process_ie_result(v, download=True)
return [v['id'] for v in ydl.downloaded_info_dicts]
@@ -947,7 +933,7 @@ class TestYoutubeDL(unittest.TestCase):
res = get_videos()
self.assertEqual(res, ['1', '2'])
def f(v):
def f(v, incomplete):
if v['id'] == '1':
return None
else:
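For reference, a minimal sketch of a complete filter under the new signature (semantics assumed from this test: returning `None` accepts the video, returning a string skips it with that string as the reason):

```python
def f(v, incomplete):
    # `incomplete` is the new second parameter; it is truthy when only
    # partial metadata is available for the video
    if v['id'] == '1':
        return None
    return 'Video id is not 1'
```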

View File

@@ -12,11 +12,6 @@ from test.helper import FakeYDL, is_download_test
from yt_dlp.extractor import IqiyiIE
class IqiyiIEWithCredentials(IqiyiIE):
def _get_login_info(self):
return 'foo', 'bar'
class WarningLogger(object):
def __init__(self):
self.messages = []
@@ -40,8 +35,8 @@ class TestIqiyiSDKInterpreter(unittest.TestCase):
If `sign` is incorrect, the /validate call throws an HTTP 556 error
'''
logger = WarningLogger()
ie = IqiyiIEWithCredentials(FakeYDL({'logger': logger}))
ie._login()
ie = IqiyiIE(FakeYDL({'logger': logger}))
ie._perform_login('foo', 'bar')
self.assertTrue('unable to log in:' in logger.messages[0])

View File

@@ -7,18 +7,19 @@ import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from yt_dlp.extractor import (
gen_extractors,
)
from yt_dlp.extractor import gen_extractor_classes
from yt_dlp.extractor.common import InfoExtractor
NO_LOGIN = InfoExtractor._perform_login
class TestNetRc(unittest.TestCase):
def test_netrc_present(self):
for ie in gen_extractors():
if not hasattr(ie, '_login'):
for ie in gen_extractor_classes():
if ie._perform_login is NO_LOGIN:
continue
self.assertTrue(
hasattr(ie, '_NETRC_MACHINE'),
ie._NETRC_MACHINE,
'Extractor %s supports login, but is missing a _NETRC_MACHINE property' % ie.IE_NAME)

View File

@@ -56,6 +56,7 @@ from yt_dlp.utils import (
is_html,
js_to_json,
limit_length,
locked_file,
merge_dicts,
mimetype2ext,
month_by_name,
@@ -160,10 +161,12 @@ class TestUtil(unittest.TestCase):
sanitize_filename('New World record at 0:12:34'),
'New World record at 0_12_34')
self.assertEqual(sanitize_filename('--gasdgf'), '_-gasdgf')
self.assertEqual(sanitize_filename('--gasdgf'), '--gasdgf')
self.assertEqual(sanitize_filename('--gasdgf', is_id=True), '--gasdgf')
self.assertEqual(sanitize_filename('.gasdgf'), 'gasdgf')
self.assertEqual(sanitize_filename('--gasdgf', is_id=False), '_-gasdgf')
self.assertEqual(sanitize_filename('.gasdgf'), '.gasdgf')
self.assertEqual(sanitize_filename('.gasdgf', is_id=True), '.gasdgf')
self.assertEqual(sanitize_filename('.gasdgf', is_id=False), 'gasdgf')
forbidden = '"\0\\/'
for fc in forbidden:
@@ -625,6 +628,8 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_duration('3h 11m 53s'), 11513)
self.assertEqual(parse_duration('3 hours 11 minutes 53 seconds'), 11513)
self.assertEqual(parse_duration('3 hours 11 mins 53 secs'), 11513)
self.assertEqual(parse_duration('3 hours, 11 minutes, 53 seconds'), 11513)
self.assertEqual(parse_duration('3 hours, 11 mins, 53 secs'), 11513)
self.assertEqual(parse_duration('62m45s'), 3765)
self.assertEqual(parse_duration('6m59s'), 419)
self.assertEqual(parse_duration('49s'), 49)
@@ -1780,6 +1785,7 @@ Line 1
self.assertEqual(format_bytes(1024**6), '1.00EiB')
self.assertEqual(format_bytes(1024**7), '1.00ZiB')
self.assertEqual(format_bytes(1024**8), '1.00YiB')
self.assertEqual(format_bytes(1024**9), '1024.00YiB')
def test_hide_login_info(self):
self.assertEqual(Config.hide_login_info(['-u', 'foo', '-p', 'bar']),
@@ -1790,6 +1796,36 @@ Line 1
self.assertEqual(Config.hide_login_info(['--username=foo']),
['--username=PRIVATE'])
def test_locked_file(self):
TEXT = 'test_locked_file\n'
FILE = 'test_locked_file.ytdl'
MODES = 'war' # Order is important
try:
for lock_mode in MODES:
with locked_file(FILE, lock_mode, False) as f:
if lock_mode == 'r':
self.assertEqual(f.read(), TEXT * 2, 'Wrong file content')
else:
f.write(TEXT)
for test_mode in MODES:
testing_write = test_mode != 'r'
try:
with locked_file(FILE, test_mode, False):
pass
except (BlockingIOError, PermissionError):
if not testing_write: # FIXME
print(f'Known issue: Exclusive lock ({lock_mode}) blocks read access ({test_mode})')
continue
self.assertTrue(testing_write, f'{test_mode} is blocked by {lock_mode}')
else:
self.assertFalse(testing_write, f'{test_mode} is not blocked by {lock_mode}')
finally:
try:
os.remove(FILE)
except Exception:
pass
if __name__ == '__main__':
unittest.main()
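For context, a minimal usage sketch of the `locked_file` context manager exercised above (argument order as in the test: filename, mode, then `block`):

```python
from yt_dlp.utils import locked_file

# block=False makes acquisition non-blocking: if another process holds an
# incompatible lock, this raises instead of waiting for the lock
with locked_file('archive.txt', 'a', False) as f:
    f.write('youtube xxxxxxxxxxx\n')
```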

View File

@@ -90,6 +90,10 @@ _NSIG_TESTS = [
'https://www.youtube.com/s/player/e06dea74/player_ias.vflset/en_US/base.js',
'AiuodmaDDYw8d3y4bf', 'ankd8eza2T6Qmw',
),
(
'https://www.youtube.com/s/player/5dd88d1d/player-plasma-ias-phone-en_US.vflset/base.js',
'kSxKFLeqzv_ZyHSAt', 'n8gS8oRlHOxPFA',
),
]

View File

@@ -32,6 +32,7 @@ from string import ascii_letters
from .compat import (
compat_basestring,
compat_brotli,
compat_get_terminal_size,
compat_kwargs,
compat_numeric_types,
@@ -64,6 +65,7 @@ from .utils import (
ExistingVideoReached,
expand_path,
ExtractorError,
filter_dict,
float_or_none,
format_bytes,
format_field,
@@ -71,6 +73,7 @@ from .utils import (
formatSeconds,
GeoRestrictedError,
get_domain,
has_certifi,
HEADRequest,
InAdvancePagedList,
int_or_none,
@@ -83,7 +86,9 @@ from .utils import (
make_dir,
make_HTTPS_handler,
MaxDownloadsReached,
merge_headers,
network_exceptions,
NO_DEFAULT,
number_of_digits,
orderedSet,
OUTTMPL_TYPES,
@@ -233,6 +238,8 @@ class YoutubeDL(object):
See "Sorting Formats" for more details.
format_sort_force: Force the given format_sort. see "Sorting Formats"
for more details.
prefer_free_formats: Whether to prefer video formats with free containers
over non-free ones of same quality.
allow_multiple_video_streams: Allow multiple video streams to be merged
into a single file
allow_multiple_audio_streams: Allow multiple audio streams to be merged
@@ -332,6 +339,7 @@ class YoutubeDL(object):
nocheckcertificate: Do not verify SSL certificates
prefer_insecure: Use HTTP instead of HTTPS to retrieve information.
At the moment, this is only supported by YouTube.
http_headers: A dictionary of custom headers to be used for all requests
proxy: URL of the proxy server to use
geo_verification_proxy: URL of the proxy to use for IP address verification
on geo-restricted sites.
@@ -507,23 +515,22 @@ class YoutubeDL(object):
'track_number', 'disc_number', 'release_year',
))
_format_fields = {
# NB: Keep in sync with the docstring of extractor/common.py
'url', 'manifest_url', 'manifest_stream_number', 'ext', 'format', 'format_id', 'format_note',
'width', 'height', 'resolution', 'dynamic_range', 'tbr', 'abr', 'acodec', 'asr',
'vbr', 'fps', 'vcodec', 'container', 'filesize', 'filesize_approx',
'player_url', 'protocol', 'fragment_base_url', 'fragments', 'is_from_start',
'preference', 'language', 'language_preference', 'quality', 'source_preference',
'http_headers', 'stretched_ratio', 'no_resume', 'has_drm', 'downloader_options',
'page_url', 'app', 'play_path', 'tc_url', 'flash_version', 'rtmp_live', 'rtmp_conn', 'rtmp_protocol', 'rtmp_real_time'
}
_format_selection_exts = {
'audio': {'m4a', 'mp3', 'ogg', 'aac'},
'video': {'mp4', 'flv', 'webm', '3gp'},
'storyboards': {'mhtml'},
}
params = None
_ies = {}
_pps = {k: [] for k in POSTPROCESS_WHEN}
_printed_messages = set()
_first_webpage_request = True
_download_retcode = None
_num_downloads = None
_playlist_level = 0
_playlist_urls = set()
_screen_file = None
def __init__(self, params=None, auto_init=True):
"""Create a FileDownloader object with the given options.
@param auto_init Whether to load the default extractors and print header (if verbose).
@@ -531,6 +538,7 @@ class YoutubeDL(object):
"""
if params is None:
params = {}
self.params = params
self._ies = {}
self._ies_instances = {}
self._pps = {k: [] for k in POSTPROCESS_WHEN}
@@ -542,15 +550,21 @@ class YoutubeDL(object):
self._download_retcode = 0
self._num_downloads = 0
self._num_videos = 0
self._screen_file = [sys.stdout, sys.stderr][params.get('logtostderr', False)]
self._err_file = sys.stderr
self.params = params
self._playlist_level = 0
self._playlist_urls = set()
self.cache = Cache(self)
windows_enable_vt_mode()
self._out_files = {
'error': sys.stderr,
'print': sys.stderr if self.params.get('logtostderr') else sys.stdout,
'console': None if compat_os_name == 'nt' else next(
filter(supports_terminal_sequences, (sys.stderr, sys.stdout)), None)
}
self._out_files['screen'] = sys.stderr if self.params.get('quiet') else self._out_files['print']
self._allow_colors = {
'screen': not self.params.get('no_color') and supports_terminal_sequences(self._screen_file),
'err': not self.params.get('no_color') and supports_terminal_sequences(self._err_file),
type_: not self.params.get('no_color') and supports_terminal_sequences(self._out_files[type_])
for type_ in ('screen', 'error')
}
if sys.version_info < (3, 6):
@@ -615,7 +629,7 @@ class YoutubeDL(object):
sp_kwargs = dict(
stdin=subprocess.PIPE,
stdout=slave,
stderr=self._err_file)
stderr=self._out_files['error'])
try:
self._output_process = Popen(['bidiv'] + width_args, **sp_kwargs)
except OSError:
@@ -647,6 +661,9 @@ class YoutubeDL(object):
else self.params['format'] if callable(self.params['format'])
else self.build_format_selector(self.params['format']))
# Set http_headers defaults according to std_headers
self.params['http_headers'] = merge_headers(std_headers, self.params.get('http_headers', {}))
self._setup_opener()
if auto_init:
@@ -780,14 +797,24 @@ class YoutubeDL(object):
self._printed_messages.add(message)
write_string(message, out=out, encoding=self.params.get('encoding'))
def to_stdout(self, message, skip_eol=False, quiet=False):
def to_stdout(self, message, skip_eol=False, quiet=None):
"""Print message to stdout"""
if self.params.get('logger'):
self.params['logger'].debug(message)
elif not quiet or self.params.get('verbose'):
if quiet is not None:
self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument quiet. Use "YoutubeDL.to_screen" instead')
self._write_string(
'%s%s' % (self._bidi_workaround(message), ('' if skip_eol else '\n')),
self._err_file if quiet else self._screen_file)
self._out_files['print'])
def to_screen(self, message, skip_eol=False, quiet=None):
"""Print message to screen if not in quiet mode"""
if self.params.get('logger'):
self.params['logger'].debug(message)
return
if (self.params.get('quiet') if quiet is None else quiet) and not self.params.get('verbose'):
return
self._write_string(
'%s%s' % (self._bidi_workaround(message), ('' if skip_eol else '\n')),
self._out_files['screen'])
def to_stderr(self, message, only_once=False):
"""Print message to stderr"""
@@ -795,7 +822,12 @@ class YoutubeDL(object):
if self.params.get('logger'):
self.params['logger'].error(message)
else:
self._write_string('%s\n' % self._bidi_workaround(message), self._err_file, only_once=only_once)
self._write_string('%s\n' % self._bidi_workaround(message), self._out_files['error'], only_once=only_once)
def _send_console_code(self, code):
if compat_os_name == 'nt' or not self._out_files['console']:
return
self._write_string(code, self._out_files['console'])
def to_console_title(self, message):
if not self.params.get('consoletitle', False):
@@ -806,26 +838,18 @@ class YoutubeDL(object):
# c_wchar_p() might not be necessary if `message` is
# already of type unicode()
ctypes.windll.kernel32.SetConsoleTitleW(ctypes.c_wchar_p(message))
elif 'TERM' in os.environ:
self._write_string('\033]0;%s\007' % message, self._screen_file)
else:
self._send_console_code(f'\033]0;{message}\007')
def save_console_title(self):
if not self.params.get('consoletitle', False):
if not self.params.get('consoletitle') or self.params.get('simulate'):
return
if self.params.get('simulate'):
return
if compat_os_name != 'nt' and 'TERM' in os.environ:
# Save the title on stack
self._write_string('\033[22;0t', self._screen_file)
self._send_console_code('\033[22;0t') # Save the title on stack
def restore_console_title(self):
if not self.params.get('consoletitle', False):
if not self.params.get('consoletitle') or self.params.get('simulate'):
return
if self.params.get('simulate'):
return
if compat_os_name != 'nt' and 'TERM' in os.environ:
# Restore the title from stack
self._write_string('\033[23;0t', self._screen_file)
self._send_console_code('\033[23;0t') # Restore the title from stack
def __enter__(self):
self.save_console_title()
@@ -871,11 +895,6 @@ class YoutubeDL(object):
raise DownloadError(message, exc_info)
self._download_retcode = 1
def to_screen(self, message, skip_eol=False):
"""Print message to stdout if not in quiet mode"""
self.to_stdout(
message, skip_eol, quiet=self.params.get('quiet', False))
class Styles(Enum):
HEADERS = 'yellow'
EMPHASIS = 'light blue'
@@ -888,7 +907,8 @@ class YoutubeDL(object):
def _format_text(self, handle, allow_colors, text, f, fallback=None, *, test_encoding=False):
if test_encoding:
original_text = text
encoding = self.params.get('encoding') or getattr(handle, 'encoding', 'ascii')
# handle.encoding can be None. See https://github.com/yt-dlp/yt-dlp/issues/2711
encoding = self.params.get('encoding') or getattr(handle, 'encoding', None) or 'ascii'
text = text.encode(encoding, 'ignore').decode(encoding)
if fallback is not None and text != original_text:
text = fallback
@@ -898,11 +918,11 @@ class YoutubeDL(object):
def _format_screen(self, *args, **kwargs):
return self._format_text(
self._screen_file, self._allow_colors['screen'], *args, **kwargs)
self._out_files['screen'], self._allow_colors['screen'], *args, **kwargs)
def _format_err(self, *args, **kwargs):
return self._format_text(
self._err_file, self._allow_colors['err'], *args, **kwargs)
self._out_files['error'], self._allow_colors['error'], *args, **kwargs)
def report_warning(self, message, only_once=False):
'''
@@ -918,7 +938,7 @@ class YoutubeDL(object):
def deprecation_warning(self, message):
if self.params.get('logger') is not None:
self.params['logger'].warning('DeprecationWarning: {message}')
self.params['logger'].warning(f'DeprecationWarning: {message}')
else:
self.to_stderr(f'{self._format_err("DeprecationWarning:", self.Styles.ERROR)} {message}', True)
@@ -953,13 +973,13 @@ class YoutubeDL(object):
except UnicodeEncodeError:
self.to_screen('Deleting existing file')
def raise_no_formats(self, info, forced=False):
def raise_no_formats(self, info, forced=False, *, msg=None):
has_drm = info.get('__has_drm')
msg = 'This video is DRM protected' if has_drm else 'No video formats found!'
expected = self.params.get('ignore_no_formats_error')
if forced or not expected:
ignored, expected = self.params.get('ignore_no_formats_error'), bool(msg)
msg = msg or has_drm and 'This video is DRM protected' or 'No video formats found!'
if forced or not ignored:
raise ExtractorError(msg, video_id=info['id'], ie=info['extractor'],
expected=has_drm or expected)
expected=has_drm or ignored or expected)
else:
self.report_warning(msg)
@@ -1036,8 +1056,7 @@ class YoutubeDL(object):
@staticmethod
def _copy_infodict(info_dict):
info_dict = dict(info_dict)
for key in ('__original_infodict', '__postprocessors'):
info_dict.pop(key, None)
info_dict.pop('__postprocessors', None)
return info_dict
def prepare_outtmpl(self, outtmpl, info_dict, sanitize=False):
@@ -1082,10 +1101,11 @@ class YoutubeDL(object):
(?P<fields>{field})
(?P<maths>(?:{math_op}{math_field})*)
(?:>(?P<strf_format>.+?))?
(?P<remaining>
(?P<alternate>(?<!\\),[^|&)]+)?
(?:&(?P<replacement>.*?))?
(?:\|(?P<default>.*?))?
$'''.format(field=FIELD_RE, math_op=MATH_OPERATORS_RE, math_field=MATH_FIELD_RE))
)$'''.format(field=FIELD_RE, math_op=MATH_OPERATORS_RE, math_field=MATH_FIELD_RE))
def _traverse_infodict(k):
k = k.split('.')
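The new `remaining` group bundles the alternate, `&replacement` and `|default` parts so that alternate fields compose with them, matching the new test cases above. A sketch with an assumed minimal info dict:

```python
from yt_dlp import YoutubeDL

ydl = YoutubeDL()
info = {'id': 'abc'}
# 'x' is missing, so the alternate field 'id' is used; '&foo' then replaces
# the value, and '|baz' would apply only if all listed fields were missing
print(ydl.evaluate_outtmpl('%(x,id&foo|baz)s.bar', info))  # -> foo.bar
```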
@@ -1132,8 +1152,10 @@ class YoutubeDL(object):
na = self.params.get('outtmpl_na_placeholder', 'NA')
def filename_sanitizer(key, value, restricted=self.params.get('restrictfilenames')):
return sanitize_filename(str(value), restricted=restricted,
is_id=re.search(r'(^|[_.])id(\.|$)', key))
return sanitize_filename(str(value), restricted=restricted, is_id=(
bool(re.search(r'(^|[_.])id(\.|$)', key))
if 'filename-sanitization' in self.params.get('compat_opts', [])
else NO_DEFAULT))
sanitizer = sanitize if callable(sanitize) else filename_sanitizer
sanitize = bool(sanitize)
@@ -1156,7 +1178,7 @@ class YoutubeDL(object):
value = get_value(mobj)
replacement = mobj['replacement']
if value is None and mobj['alternate']:
mobj = re.match(INTERNAL_FORMAT_RE, mobj['alternate'][1:])
mobj = re.match(INTERNAL_FORMAT_RE, mobj['remaining'][1:])
else:
break
@@ -1218,18 +1240,21 @@ class YoutubeDL(object):
outtmpl, info_dict = self.prepare_outtmpl(outtmpl, info_dict, *args, **kwargs)
return self.escape_outtmpl(outtmpl) % info_dict
def _prepare_filename(self, info_dict, tmpl_type='default'):
def _prepare_filename(self, info_dict, *, outtmpl=None, tmpl_type=None):
assert None in (outtmpl, tmpl_type), 'outtmpl and tmpl_type are mutually exclusive'
if outtmpl is None:
outtmpl = self.outtmpl_dict.get(tmpl_type or 'default', self.outtmpl_dict['default'])
try:
outtmpl = self._outtmpl_expandpath(self.outtmpl_dict.get(tmpl_type, self.outtmpl_dict['default']))
outtmpl = self._outtmpl_expandpath(outtmpl)
filename = self.evaluate_outtmpl(outtmpl, info_dict, True)
if not filename:
return None
if tmpl_type in ('default', 'temp'):
if tmpl_type in ('', 'temp'):
final_ext, ext = self.params.get('final_ext'), info_dict.get('ext')
if final_ext and ext and final_ext != ext and filename.endswith(f'.{final_ext}'):
filename = replace_extension(filename, ext, final_ext)
else:
elif tmpl_type:
force_ext = OUTTMPL_TYPES[tmpl_type]
if force_ext:
filename = replace_extension(filename, force_ext, info_dict.get('ext'))
@@ -1245,10 +1270,12 @@ class YoutubeDL(object):
self.report_error('Error in output template: ' + str(err) + ' (encoding: ' + repr(preferredencoding()) + ')')
return None
def prepare_filename(self, info_dict, dir_type='', warn=False):
"""Generate the output filename."""
filename = self._prepare_filename(info_dict, dir_type or 'default')
def prepare_filename(self, info_dict, dir_type='', *, outtmpl=None, warn=False):
"""Generate the output filename"""
if outtmpl:
assert not dir_type, 'outtmpl and dir_type are mutually exclusive'
dir_type = None
filename = self._prepare_filename(info_dict, tmpl_type=dir_type, outtmpl=outtmpl)
if not filename and dir_type not in ('', 'temp'):
return ''
@@ -1422,7 +1449,7 @@ class YoutubeDL(object):
min_wait, max_wait = self.params.get('wait_for_video')
diff = try_get(ie_result, lambda x: x['release_timestamp'] - time.time())
if diff is None and ie_result.get('live_status') == 'is_upcoming':
diff = random.randrange(min_wait, max_wait) if (max_wait and min_wait) else (max_wait or min_wait)
diff = round(random.uniform(min_wait, max_wait) if (max_wait and min_wait) else (max_wait or min_wait), 0)
self.report_warning('Release time of video is not known')
elif (diff or 0) <= 0:
self.report_warning('Video should already be available according to extracted info')
@@ -1471,8 +1498,12 @@ class YoutubeDL(object):
self.add_extra_info(ie_result, {
'webpage_url': url,
'original_url': url,
'webpage_url_basename': url_basename(url),
'webpage_url_domain': get_domain(url),
})
webpage_url = ie_result.get('webpage_url')
if webpage_url:
self.add_extra_info(ie_result, {
'webpage_url_basename': url_basename(webpage_url),
'webpage_url_domain': get_domain(webpage_url),
})
if ie is not None:
self.add_extra_info(ie_result, {
@@ -1549,13 +1580,9 @@ class YoutubeDL(object):
if not info:
return info
force_properties = dict(
(k, v) for k, v in ie_result.items() if v is not None)
for f in ('_type', 'url', 'id', 'extractor', 'extractor_key', 'ie_key'):
if f in force_properties:
del force_properties[f]
new_result = info.copy()
new_result.update(force_properties)
new_result.update(filter_dict(ie_result, lambda k, v: (
v is not None and k not in {'_type', 'url', 'id', 'extractor', 'extractor_key', 'ie_key'})))
# Extracted info may not be a video result (i.e.
# info.get('_type', 'video') != video) but rather an url or
@@ -1580,6 +1607,7 @@ class YoutubeDL(object):
self._playlist_level += 1
self._playlist_urls.add(webpage_url)
self._fill_common_fields(ie_result, False)
self._sanitize_thumbnails(ie_result)
try:
return self.__process_playlist(ie_result, download)
@@ -1792,7 +1820,7 @@ class YoutubeDL(object):
ie_result['entries'] = playlist_results
# Write the updated info to json
if _infojson_written and self._write_info_json(
if _infojson_written is True and self._write_info_json(
'updated playlist', ie_result,
self.prepare_filename(ie_copy, 'pl_infojson'), overwrite=True) is None:
return
@@ -1842,15 +1870,21 @@ class YoutubeDL(object):
'^=': lambda attr, value: attr.startswith(value),
'$=': lambda attr, value: attr.endswith(value),
'*=': lambda attr, value: value in attr,
'~=': lambda attr, value: value.search(attr) is not None
}
str_operator_rex = re.compile(r'''(?x)\s*
(?P<key>[a-zA-Z0-9._-]+)\s*
(?P<negation>!\s*)?(?P<op>%s)(?P<none_inclusive>\s*\?)?\s*
(?P<value>[a-zA-Z0-9._-]+)\s*
(?P<negation>!\s*)?(?P<op>%s)\s*(?P<none_inclusive>\?\s*)?
(?P<quote>["'])?
(?P<value>(?(quote)(?:(?!(?P=quote))[^\\]|\\.)+|[\w.-]+))
(?(quote)(?P=quote))\s*
''' % '|'.join(map(re.escape, STR_OPERATORS.keys())))
m = str_operator_rex.fullmatch(filter_spec)
if m:
comparison_value = m.group('value')
if m.group('op') == '~=':
comparison_value = re.compile(m.group('value'))
else:
comparison_value = re.sub(r'''\\([\\"'])''', r'\1', m.group('value'))
str_op = STR_OPERATORS[m.group('op')]
if m.group('negation'):
op = lambda attr, value: not str_op(attr, value)
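With this change, string filter values may be quoted (with `\`-escapes) and the new `~=` operator matches a regular expression; a sketch of selectors this enables (URL and field values are illustrative):

```python
import yt_dlp

# "~=" matches a regex; quoting allows characters outside [\w.-]
ydl_opts = {'format': "best[format_note~='(?i)premium']/bestvideo[protocol^='m3u8']+bestaudio"}
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    info = ydl.extract_info('https://www.youtube.com/watch?v=BaW_jenozKc', download=False)
```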
@@ -2145,7 +2179,8 @@ class YoutubeDL(object):
yield from _check_formats(ctx['formats'][::-1])
elif format_spec == 'mergeall':
def selector_function(ctx):
formats = list(_check_formats(ctx['formats']))
formats = list(_check_formats(
f for f in ctx['formats'] if f.get('vcodec') != 'none' or f.get('acodec') != 'none'))
if not formats:
return
merged_format = formats[-1]
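The effect of the added pre-filter: storyboard-only formats (no audio and no video codec, e.g. `mhtml`) no longer participate in `mergeall`. A standalone sketch with hypothetical format dicts:

```python
formats = [
    {'format_id': 'sb0', 'vcodec': 'none', 'acodec': 'none'},  # storyboard: now excluded
    {'format_id': '137', 'vcodec': 'avc1', 'acodec': 'none'},  # video-only: kept
    {'format_id': '140', 'vcodec': 'none', 'acodec': 'mp4a'},  # audio-only: kept
]
mergeable = [f for f in formats if f.get('vcodec') != 'none' or f.get('acodec') != 'none']
assert [f['format_id'] for f in mergeable] == ['137', '140']
```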
@@ -2154,7 +2189,7 @@ class YoutubeDL(object):
yield merged_format
else:
format_fallback, format_reverse, format_idx = False, True, 1
format_fallback, seperate_fallback, format_reverse, format_idx = False, None, True, 1
mobj = re.match(
r'(?P<bw>best|worst|b|w)(?P<type>video|audio|v|a)?(?P<mod>\*)?(?:\.(?P<n>[1-9]\d*))?$',
format_spec)
@@ -2181,6 +2216,7 @@ class YoutubeDL(object):
filter_f = lambda f: f.get('ext') == format_spec and f.get('acodec') != 'none'
elif format_spec in self._format_selection_exts['video']:
filter_f = lambda f: f.get('ext') == format_spec and f.get('acodec') != 'none' and f.get('vcodec') != 'none'
seperate_fallback = lambda f: f.get('ext') == format_spec and f.get('vcodec') != 'none'
elif format_spec in self._format_selection_exts['storyboards']:
filter_f = lambda f: f.get('ext') == format_spec and f.get('acodec') == 'none' and f.get('vcodec') == 'none'
else:
@@ -2189,11 +2225,15 @@ class YoutubeDL(object):
def selector_function(ctx):
formats = list(ctx['formats'])
matches = list(filter(filter_f, formats)) if filter_f is not None else formats
if format_fallback and ctx['incomplete_formats'] and not matches:
if not matches:
if format_fallback and ctx['incomplete_formats']:
# for extractors with incomplete formats (audio only (soundcloud)
# or video only (imgur)), best/worst will fall back to the
# best/worst {video,audio}-only format
matches = formats
elif seperate_fallback and not ctx['has_merged_format']:
# for compatibility with youtube-dl when there is no pre-merged format
matches = list(filter(seperate_fallback, formats))
matches = LazyList(_check_formats(matches[::-1 if format_reverse else 1]))
try:
yield matches[format_idx - 1]
@@ -2239,8 +2279,7 @@ class YoutubeDL(object):
return _build_selector_function(parsed_selector)
def _calc_headers(self, info_dict):
res = std_headers.copy()
res.update(info_dict.get('http_headers') or {})
res = merge_headers(self.params['http_headers'], info_dict.get('http_headers') or {})
cookies = self._calc_cookies(info_dict)
if cookies:
@@ -2298,15 +2337,10 @@ class YoutubeDL(object):
else:
info_dict['thumbnails'] = thumbnails
def process_video_result(self, info_dict, download=True):
assert info_dict.get('_type', 'video') == 'video'
self._num_videos += 1
if 'id' not in info_dict:
raise ExtractorError('Missing "id" field in extractor result', ie=info_dict['extractor'])
elif not info_dict.get('id'):
raise ExtractorError('Extractor failed to obtain "id"', ie=info_dict['extractor'])
def _fill_common_fields(self, info_dict, is_video=True):
# TODO: move sanitization here
if is_video:
# playlists are allowed to lack "title"
info_dict['fulltitle'] = info_dict.get('title')
if 'title' not in info_dict:
raise ExtractorError('Missing "title" field in extractor result',
@@ -2315,46 +2349,6 @@ class YoutubeDL(object):
self.report_warning('Extractor failed to obtain "title". Creating a generic title instead')
info_dict['title'] = f'{info_dict["extractor"]} video #{info_dict["id"]}'
def report_force_conversion(field, field_not, conversion):
self.report_warning(
'"%s" field is not %s - forcing %s conversion, there is an error in extractor'
% (field, field_not, conversion))
def sanitize_string_field(info, string_field):
field = info.get(string_field)
if field is None or isinstance(field, compat_str):
return
report_force_conversion(string_field, 'a string', 'string')
info[string_field] = compat_str(field)
def sanitize_numeric_fields(info):
for numeric_field in self._NUMERIC_FIELDS:
field = info.get(numeric_field)
if field is None or isinstance(field, compat_numeric_types):
continue
report_force_conversion(numeric_field, 'numeric', 'int')
info[numeric_field] = int_or_none(field)
sanitize_string_field(info_dict, 'id')
sanitize_numeric_fields(info_dict)
if 'playlist' not in info_dict:
# It isn't part of a playlist
info_dict['playlist'] = None
info_dict['playlist_index'] = None
self._sanitize_thumbnails(info_dict)
thumbnail = info_dict.get('thumbnail')
thumbnails = info_dict.get('thumbnails')
if thumbnail:
info_dict['thumbnail'] = sanitize_url(thumbnail)
elif thumbnails:
info_dict['thumbnail'] = thumbnails[-1]['url']
if info_dict.get('display_id') is None and 'id' in info_dict:
info_dict['display_id'] = info_dict['id']
if info_dict.get('duration') is not None:
info_dict['duration_string'] = formatSeconds(info_dict['duration'])
@@ -2395,6 +2389,59 @@ class YoutubeDL(object):
if info_dict.get('%s_number' % field) is not None and not info_dict.get(field):
info_dict[field] = '%s %d' % (field.capitalize(), info_dict['%s_number' % field])
def process_video_result(self, info_dict, download=True):
assert info_dict.get('_type', 'video') == 'video'
self._num_videos += 1
if 'id' not in info_dict:
raise ExtractorError('Missing "id" field in extractor result', ie=info_dict['extractor'])
elif not info_dict.get('id'):
raise ExtractorError('Extractor failed to obtain "id"', ie=info_dict['extractor'])
def report_force_conversion(field, field_not, conversion):
self.report_warning(
'"%s" field is not %s - forcing %s conversion, there is an error in extractor'
% (field, field_not, conversion))
def sanitize_string_field(info, string_field):
field = info.get(string_field)
if field is None or isinstance(field, compat_str):
return
report_force_conversion(string_field, 'a string', 'string')
info[string_field] = compat_str(field)
def sanitize_numeric_fields(info):
for numeric_field in self._NUMERIC_FIELDS:
field = info.get(numeric_field)
if field is None or isinstance(field, compat_numeric_types):
continue
report_force_conversion(numeric_field, 'numeric', 'int')
info[numeric_field] = int_or_none(field)
sanitize_string_field(info_dict, 'id')
sanitize_numeric_fields(info_dict)
if (info_dict.get('duration') or 0) <= 0 and info_dict.pop('duration', None):
self.report_warning('"duration" field is negative, there is an error in extractor')
if 'playlist' not in info_dict:
# It isn't part of a playlist
info_dict['playlist'] = None
info_dict['playlist_index'] = None
self._sanitize_thumbnails(info_dict)
thumbnail = info_dict.get('thumbnail')
thumbnails = info_dict.get('thumbnails')
if thumbnail:
info_dict['thumbnail'] = sanitize_url(thumbnail)
elif thumbnails:
info_dict['thumbnail'] = thumbnails[-1]['url']
if info_dict.get('display_id') is None and 'id' in info_dict:
info_dict['display_id'] = info_dict['id']
self._fill_common_fields(info_dict)
for cc_kind in ('subtitles', 'automatic_captions'):
cc = info_dict.get(cc_kind)
if cc:
@@ -2420,12 +2467,21 @@ class YoutubeDL(object):
info_dict['__has_drm'] = any(f.get('has_drm') for f in formats)
if not self.params.get('allow_unplayable_formats'):
formats = [f for f in formats if not f.get('has_drm')]
if info_dict['__has_drm'] and all(
f.get('acodec') == f.get('vcodec') == 'none' for f in formats):
self.report_warning(
'This video is DRM protected and only images are available for download. '
'Use --list-formats to see them')
if info_dict.get('is_live'):
get_from_start = bool(self.params.get('live_from_start'))
formats = [f for f in formats if bool(f.get('is_from_start')) == get_from_start]
get_from_start = not info_dict.get('is_live') or bool(self.params.get('live_from_start'))
if not get_from_start:
info_dict['title'] += ' ' + datetime.datetime.now().strftime('%Y-%m-%d %H:%M')
if info_dict.get('is_live') and formats:
formats = [f for f in formats if bool(f.get('is_from_start')) == get_from_start]
if get_from_start and not formats:
self.raise_no_formats(info_dict, msg=(
'--live-from-start is passed, but there are no formats that can be downloaded from the start. '
'If you want to download from the current time, use --no-live-from-start'))
if not formats:
self.raise_no_formats(info_dict)
@@ -2501,8 +2557,6 @@ class YoutubeDL(object):
if '__x_forwarded_for_ip' in info_dict:
del info_dict['__x_forwarded_for_ip']
# TODO Central sorting goes here
if self.params.get('check_formats') is True:
formats = LazyList(self._check_formats(formats[::-1]), reverse=True)
@@ -2515,6 +2569,12 @@ class YoutubeDL(object):
info_dict, _ = self.pre_process(info_dict)
if self._match_entry(info_dict, incomplete=self._format_fields) is not None:
return info_dict
self.post_extract(info_dict)
info_dict, _ = self.pre_process(info_dict, 'after_filter')
# The pre-processors may have modified the formats
formats = info_dict.get('formats', [info_dict])
@@ -2551,33 +2611,15 @@ class YoutubeDL(object):
self.report_error(err, tb=False, is_error=False)
continue
# While in format selection we may need to have an access to the original
# format set in order to calculate some metrics or do some processing.
# For now we need to be able to guess whether original formats provided
# by extractor are incomplete or not (i.e. whether extractor provides only
# video-only or audio-only formats) for proper formats selection for
# extractors with such incomplete formats (see
# https://github.com/ytdl-org/youtube-dl/pull/5556).
# Since formats may be filtered during format selection and may not match
# the original formats the results may be incorrect. Thus original formats
# or pre-calculated metrics should be passed to format selection routines
# as well.
# We will pass a context object containing all necessary additional data
# instead of just formats.
# This fixes incorrect format selection issue (see
# https://github.com/ytdl-org/youtube-dl/issues/10083).
incomplete_formats = (
formats_to_download = list(format_selector({
'formats': formats,
'has_merged_format': any('none' not in (f.get('acodec'), f.get('vcodec')) for f in formats),
'incomplete_formats': (
# All formats are video-only or
all(f.get('vcodec') != 'none' and f.get('acodec') == 'none' for f in formats)
# all formats are audio-only
or all(f.get('vcodec') == 'none' and f.get('acodec') != 'none' for f in formats))
ctx = {
'formats': formats,
'incomplete_formats': incomplete_formats,
}
formats_to_download = list(format_selector(ctx))
or all(f.get('vcodec') == 'none' and f.get('acodec') != 'none' for f in formats)),
}))
if interactive_format_selection and not formats_to_download:
self.report_error('Requested format is not available', tb=False, is_error=False)
continue
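
The comment block above is the rationale for passing a context object into the format selector instead of bare formats. As a rough standalone sketch (plain dicts with hypothetical sample data, not YoutubeDL state), the two context flags are computed like this:

# Hedged sketch of the selector-context flags built above
formats = [
    {'format_id': '137', 'vcodec': 'avc1', 'acodec': 'none'},  # video-only
    {'format_id': '140', 'vcodec': 'none', 'acodec': 'mp4a'},  # audio-only
]
has_merged_format = any(
    'none' not in (f.get('acodec'), f.get('vcodec')) for f in formats)
incomplete_formats = (
    # all formats are video-only ...
    all(f.get('vcodec') != 'none' and f.get('acodec') == 'none' for f in formats)
    # ... or all formats are audio-only
    or all(f.get('vcodec') == 'none' and f.get('acodec') != 'none' for f in formats))
print(has_merged_format, incomplete_formats)  # False False: a mixed list can still be merged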
@@ -2585,8 +2627,9 @@ class YoutubeDL(object):
if not formats_to_download:
if not self.params.get('ignore_no_formats_error'):
raise ExtractorError('Requested format is not available', expected=True,
video_id=info_dict['id'], ie=info_dict['extractor'])
raise ExtractorError(
'Requested format is not available. Use --list-formats for a list of available formats',
expected=True, video_id=info_dict['id'], ie=info_dict['extractor'])
self.report_warning('Requested format is not available')
# Process what we can, even without any available formats.
formats_to_download = [{}]
@@ -2599,15 +2642,12 @@ class YoutubeDL(object):
+ ', '.join([f['format_id'] for f in formats_to_download]))
max_downloads_reached = False
for i, fmt in enumerate(formats_to_download):
formats_to_download[i] = new_info = dict(info_dict)
# Save a reference to the original info_dict so that it can be modified in process_info if needed
formats_to_download[i] = new_info = self._copy_infodict(info_dict)
new_info.update(fmt)
new_info['__original_infodict'] = info_dict
try:
self.process_info(new_info)
except MaxDownloadsReached:
max_downloads_reached = True
new_info.pop('__original_infodict')
# Remove copied info
for key, val in tuple(new_info.items()):
if info_dict.get(key) == val:
@@ -2631,9 +2671,10 @@ class YoutubeDL(object):
def process_subtitles(self, video_id, normal_subtitles, automatic_captions):
"""Select the requested subtitles and their format"""
available_subs = {}
available_subs, normal_sub_langs = {}, []
if normal_subtitles and self.params.get('writesubtitles'):
available_subs.update(normal_subtitles)
normal_sub_langs = tuple(normal_subtitles.keys())
if automatic_captions and self.params.get('writeautomaticsub'):
for lang, cap_info in automatic_captions.items():
if lang not in available_subs:
@@ -2644,7 +2685,7 @@ class YoutubeDL(object):
available_subs):
return None
all_sub_langs = available_subs.keys()
all_sub_langs = tuple(available_subs.keys())
if self.params.get('allsubtitles', False):
requested_langs = all_sub_langs
elif self.params.get('subtitleslangs', False):
@@ -2652,12 +2693,15 @@ class YoutubeDL(object):
# given in subtitleslangs. See https://github.com/yt-dlp/yt-dlp/issues/1041
requested_langs = []
for lang_re in self.params.get('subtitleslangs'):
if lang_re == 'all':
requested_langs.extend(all_sub_langs)
continue
discard = lang_re[0] == '-'
if discard:
lang_re = lang_re[1:]
if lang_re == 'all':
if discard:
requested_langs = []
else:
requested_langs.extend(all_sub_langs)
continue
current_langs = filter(re.compile(lang_re + '$').match, all_sub_langs)
if discard:
for lang in current_langs:
@@ -2666,10 +2710,10 @@ class YoutubeDL(object):
else:
requested_langs.extend(current_langs)
requested_langs = orderedSet(requested_langs)
elif 'en' in available_subs:
requested_langs = ['en']
elif normal_sub_langs:
requested_langs = ['en'] if 'en' in normal_sub_langs else normal_sub_langs[:1]
else:
requested_langs = [list(all_sub_langs)[0]]
requested_langs = ['en'] if 'en' in all_sub_langs else all_sub_langs[:1]
if requested_langs:
self.write_debug('Downloading subtitles: %s' % ', '.join(requested_langs))
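
The pattern handling above ('-' prefix discards, 'all' selects or clears everything) can be exercised in isolation; the following is a simplified sketch with made-up languages, not the actual YoutubeDL method:

import re

all_sub_langs = ('en', 'en-US', 'ja', 'live_chat')
requested = []
for lang_re in ['all', '-live_chat']:  # e.g. --sub-langs "all,-live_chat"
    discard = lang_re[0] == '-'
    if discard:
        lang_re = lang_re[1:]
    if lang_re == 'all':
        requested = [] if discard else list(all_sub_langs)
        continue
    matched = list(filter(re.compile(lang_re + '$').match, all_sub_langs))
    if discard:
        requested = [lang for lang in requested if lang not in matched]
    else:
        requested.extend(matched)
print(requested)  # ['en', 'en-US', 'ja']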
@@ -2718,9 +2762,10 @@ class YoutubeDL(object):
self.to_stdout(self.evaluate_outtmpl(format_tmpl(tmpl), info_copy))
for tmpl, file_tmpl in self.params['print_to_file'].get(key, []):
filename = self.evaluate_outtmpl(file_tmpl, info_dict)
filename = self.prepare_filename(info_dict, outtmpl=file_tmpl)
tmpl = format_tmpl(tmpl)
self.to_screen(f'[info] Writing {tmpl!r} to: {filename}')
if self._ensure_dir_exists(filename):
with io.open(filename, 'a', encoding='utf-8') as f:
f.write(self.evaluate_outtmpl(tmpl, info_copy) + '\n')
@@ -2743,7 +2788,7 @@ class YoutubeDL(object):
if info_dict.get('requested_formats') is not None:
# For RTMP URLs, also include the playpath
info_dict['urls'] = '\n'.join(f['url'] + f.get('play_path', '') for f in info_dict['requested_formats'])
elif 'url' in info_dict:
elif info_dict.get('url'):
info_dict['urls'] = info_dict['url'] + info_dict.get('play_path', '')
if (self.params.get('forcejson')
@@ -2811,7 +2856,7 @@ class YoutubeDL(object):
return None
def process_info(self, info_dict):
"""Process a single resolved IE result. (Modified it in-place)"""
"""Process a single resolved IE result. (Modifies it in-place)"""
assert info_dict.get('_type', 'video') == 'video'
original_infodict = info_dict
@@ -2819,10 +2864,13 @@ class YoutubeDL(object):
if 'format' not in info_dict and 'ext' in info_dict:
info_dict['format'] = info_dict['ext']
# This is mostly just for backward compatibility of process_info
# As a side-effect, this allows for format-specific filters
if self._match_entry(info_dict) is not None:
info_dict['__write_download_archive'] = 'ignore'
return
# Does nothing under normal operation - for backward compatibility of process_info
self.post_extract(info_dict)
self._num_downloads += 1
@@ -2893,9 +2941,11 @@ class YoutubeDL(object):
# Write internet shortcut files
def _write_link_file(link_type):
if 'webpage_url' not in info_dict:
self.report_error('Cannot write internet shortcut file because the "webpage_url" field is missing in the media information')
return False
url = try_get(info_dict['webpage_url'], iri_to_uri)
if not url:
self.report_warning(
f'Cannot write internet shortcut file because the actual URL of "{info_dict["webpage_url"]}" is unknown')
return True
linkfn = replace_extension(self.prepare_filename(info_dict, 'link'), link_type, info_dict.get('ext'))
if not self._ensure_dir_exists(encodeFilename(linkfn)):
return False
@@ -2906,7 +2956,7 @@ class YoutubeDL(object):
self.to_screen(f'[info] Writing internet shortcut (.{link_type}) to: {linkfn}')
with io.open(encodeFilename(to_high_limit_path(linkfn)), 'w', encoding='utf-8',
newline='\r\n' if link_type == 'url' else '\n') as linkfile:
template_vars = {'url': iri_to_uri(info_dict['webpage_url'])}
template_vars = {'url': url}
if link_type == 'desktop':
template_vars['filename'] = linkfn[:-(len(link_type) + 1)]
linkfile.write(LINK_TEMPLATES[link_type] % template_vars)
@@ -3041,9 +3091,11 @@ class YoutubeDL(object):
'while also allowing unplayable formats to be downloaded. '
'The formats won\'t be merged to prevent data corruption.')
elif not merger.available:
self.report_warning(
'You have requested merging of multiple formats but ffmpeg is not installed. '
'The formats won\'t be merged.')
msg = 'You have requested merging of multiple formats but ffmpeg is not installed'
if not self.params.get('ignoreerrors'):
self.report_error(f'{msg}. Aborting due to --abort-on-error')
return
self.report_warning(f'{msg}. The formats won\'t be merged')
if temp_filename == '-':
reason = ('using a downloader other than ffmpeg' if FFmpegFD.can_merge_formats(info_dict, self.params)
@@ -3240,17 +3292,14 @@ class YoutubeDL(object):
return info_dict
info_dict.setdefault('epoch', int(time.time()))
info_dict.setdefault('_type', 'video')
remove_keys = {'__original_infodict'} # Always remove this since this may contain a copy of the entire dict
keep_keys = ['_type'] # Always keep this to facilitate load-info-json
if remove_private_keys:
remove_keys |= {
reject = lambda k, v: v is None or (k.startswith('_') and k != '_type') or k in {
'requested_downloads', 'requested_formats', 'requested_subtitles', 'requested_entries',
'entries', 'filepath', 'infojson_filename', 'original_url', 'playlist_autonumber',
}
reject = lambda k, v: k not in keep_keys and (
k.startswith('_') or k in remove_keys or v is None)
else:
reject = lambda k, v: k in remove_keys
reject = lambda k, v: False
def filter_fn(obj):
if isinstance(obj, dict):
@@ -3277,14 +3326,8 @@ class YoutubeDL(object):
actual_post_extract(video_dict or {})
return
post_extractor = info_dict.get('__post_extractor') or (lambda: {})
extra = post_extractor().items()
info_dict.update(extra)
info_dict.pop('__post_extractor', None)
original_infodict = info_dict.get('__original_infodict') or {}
original_infodict.update(extra)
original_infodict.pop('__post_extractor', None)
post_extractor = info_dict.pop('__post_extractor', None) or (lambda: {})
info_dict.update(post_extractor())
actual_post_extract(info_dict or {})
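
The reject predicate above drives a recursive filter over the infodict. A minimal flat sketch of the same idea (hypothetical helper, not the real filter_fn, which also recurses into lists and nested dicts):

def clean_info(info, remove_private_keys=True):
    private = {'requested_downloads', 'requested_formats', 'requested_subtitles',
               'requested_entries', 'entries', 'filepath', 'infojson_filename',
               'original_url', 'playlist_autonumber'}
    if remove_private_keys:
        reject = lambda k, v: v is None or (k.startswith('_') and k != '_type') or k in private
    else:
        reject = lambda k, v: False
    return {k: v for k, v in info.items() if not reject(k, v)}

print(clean_info({'_type': 'video', 'id': 'x', 'filepath': '/tmp/x.mp4', 'duration': None}))
# {'_type': 'video', 'id': 'x'}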
@@ -3562,7 +3605,7 @@ class YoutubeDL(object):
return
def get_encoding(stream):
ret = getattr(stream, 'encoding', 'missing (%s)' % type(stream).__name__)
ret = str(getattr(stream, 'encoding', 'missing (%s)' % type(stream).__name__))
if not supports_terminal_sequences(stream):
from .compat import WINDOWS_VT_MODE
ret += ' (No VT)' if WINDOWS_VT_MODE is False else ' (No ANSI)'
@@ -3571,7 +3614,7 @@ class YoutubeDL(object):
encoding_str = 'Encodings: locale %s, fs %s, out %s, err %s, pref %s' % (
locale.getpreferredencoding(),
sys.getfilesystemencoding(),
get_encoding(self._screen_file), get_encoding(self._err_file),
get_encoding(self._out_files['screen']), get_encoding(self._out_files['error']),
self.get_encoding())
logger = self.params.get('logger')
@@ -3645,6 +3688,8 @@ class YoutubeDL(object):
from .cookies import SQLITE_AVAILABLE, SECRETSTORAGE_AVAILABLE
lib_str = join_nonempty(
compat_brotli and compat_brotli.__name__,
has_certifi and 'certifi',
compat_pycrypto_AES and compat_pycrypto_AES.__name__.split('.')[0],
SECRETSTORAGE_AVAILABLE and 'secretstorage',
has_mutagen and 'mutagen',
@@ -3736,7 +3781,7 @@ class YoutubeDL(object):
return encoding
def _write_info_json(self, label, ie_result, infofn, overwrite=None):
''' Write infojson and returns True = written, False = skip, None = error '''
''' Write infojson and returns True = written, 'exists' = Already exists, False = skip, None = error '''
if overwrite is None:
overwrite = self.params.get('overwrites', True)
if not self.params.get('writeinfojson'):
@@ -3748,14 +3793,15 @@ class YoutubeDL(object):
return None
elif not overwrite and os.path.exists(infofn):
self.to_screen(f'[info] {label.title()} metadata is already present')
else:
return 'exists'
self.to_screen(f'[info] Writing {label} metadata as JSON to: {infofn}')
try:
write_json_file(self.sanitize_info(ie_result, self.params.get('clean_infojson', True)), infofn)
return True
except (OSError, IOError):
self.report_error(f'Cannot write {label} metadata to JSON file {infofn}')
return None
return True
def _write_description(self, label, ie_result, descfn):
''' Write description and returns True = written, False = skip, None = error '''
@@ -3826,9 +3872,12 @@ class YoutubeDL(object):
sub_info['filepath'] = sub_filename
ret.append((sub_filename, sub_filename_final))
except (DownloadError, ExtractorError, IOError, OSError, ValueError) + network_exceptions as err:
msg = f'Unable to download video subtitles for {sub_lang!r}: {err}'
if self.params.get('ignoreerrors') is not True: # False or 'only_download'
raise DownloadError(f'Unable to download video subtitles for {sub_lang!r}: {err}', err)
self.report_warning(f'Unable to download video subtitles for {sub_lang!r}: {err}')
if not self.params.get('ignoreerrors'):
self.report_error(msg)
raise DownloadError(msg)
self.report_warning(msg)
return ret
def _write_thumbnails(self, label, info_dict, filename, thumb_filename_base=None):
@@ -3860,7 +3909,7 @@ class YoutubeDL(object):
else:
self.to_screen(f'[info] Downloading {thumb_display_id} ...')
try:
uf = self.urlopen(t['url'])
uf = self.urlopen(sanitized_Request(t['url'], headers=t.get('http_headers', {})))
self.to_screen(f'[info] Writing {thumb_display_id} to: {thumb_filename}')
with open(encodeFilename(thumb_filename), 'wb') as thumbf:
shutil.copyfileobj(uf, thumbf)

File diff suppressed because it is too large


@@ -134,6 +134,16 @@ except AttributeError:
asyncio.run = compat_asyncio_run
try: # >= 3.7
asyncio.tasks.all_tasks
except AttributeError:
asyncio.tasks.all_tasks = asyncio.tasks.Task.all_tasks
try:
import websockets as compat_websockets
except ImportError:
compat_websockets = None
# Python 3.8+ does not honor %HOME% on windows, but this breaks compatibility with youtube-dl
# See https://github.com/yt-dlp/yt-dlp/issues/792
# https://docs.python.org/3/library/os.path.html#os.path.expanduser
@@ -160,6 +170,13 @@ except ImportError:
except ImportError:
compat_pycrypto_AES = None
try:
import brotlicffi as compat_brotli
except ImportError:
try:
import brotli as compat_brotli
except ImportError:
compat_brotli = None
WINDOWS_VT_MODE = False if compat_os_name == 'nt' else None
@@ -248,6 +265,7 @@ __all__ = [
'compat_asyncio_run',
'compat_b64decode',
'compat_basestring',
'compat_brotli',
'compat_chr',
'compat_collections_abc',
'compat_cookiejar',
@@ -303,6 +321,7 @@ __all__ = [
'compat_urllib_response',
'compat_urlparse',
'compat_urlretrieve',
'compat_websockets',
'compat_xml_parse_error',
'compat_xpath',
'compat_zip',


@@ -21,6 +21,7 @@ from .compat import (
compat_cookiejar_Cookie,
)
from .utils import (
error_to_str,
expand_path,
Popen,
YoutubeDLCookieJar,
@@ -453,6 +454,9 @@ def _extract_safari_cookies(profile, logger):
cookies_path = os.path.expanduser('~/Library/Cookies/Cookies.binarycookies')
if not os.path.isfile(cookies_path):
logger.debug('Trying secondary cookie location')
cookies_path = os.path.expanduser('~/Library/Containers/com.apple.Safari/Data/Library/Cookies/Cookies.binarycookies')
if not os.path.isfile(cookies_path):
raise FileNotFoundError('could not find safari cookies database')
@@ -718,7 +722,7 @@ def _get_kwallet_network_wallet(logger):
network_wallet = stdout.decode('utf-8').strip()
logger.debug('NetworkWallet = "{}"'.format(network_wallet))
return network_wallet
except BaseException as e:
except Exception as e:
logger.warning('exception while obtaining NetworkWallet: {}'.format(e))
return default_wallet
@@ -763,8 +767,8 @@ def _get_kwallet_password(browser_keyring_name, logger):
if stdout[-1:] == b'\n':
stdout = stdout[:-1]
return stdout
except BaseException as e:
logger.warning(f'exception running kwallet-query: {type(e).__name__}({e})')
except Exception as e:
logger.warning(f'exception running kwallet-query: {error_to_str(e)}')
return b''
@@ -820,8 +824,8 @@ def _get_mac_keyring_password(browser_keyring_name, logger):
if stdout[-1:] == b'\n':
stdout = stdout[:-1]
return stdout
except BaseException as e:
logger.warning(f'exception running find-generic-password: {type(e).__name__}({e})')
except Exception as e:
logger.warning(f'exception running find-generic-password: {error_to_str(e)}')
return None


@@ -30,6 +30,7 @@ def get_suitable_downloader(info_dict, params={}, default=NO_DEFAULT, protocol=N
from .common import FileDownloader
from .dash import DashSegmentsFD
from .f4m import F4mFD
from .fc2 import FC2LiveFD
from .hls import HlsFD
from .http import HttpFD
from .rtmp import RtmpFD
@@ -58,6 +59,7 @@ PROTOCOL_MAP = {
'ism': IsmFD,
'mhtml': MhtmlFD,
'niconico_dmc': NiconicoDmcFD,
'fc2_live': FC2LiveFD,
'websocket_frag': WebSocketFragmentFD,
'youtube_live_chat': YoutubeLiveChatFD,
'youtube_live_chat_replay': YoutubeLiveChatFD,
@@ -117,7 +119,7 @@ def _get_suitable_downloader(info_dict, protocol, params, default):
return FFmpegFD
elif (external_downloader or '').lower() == 'native':
return HlsFD
elif get_suitable_downloader(
elif protocol == 'm3u8_native' and get_suitable_downloader(
info_dict, params, None, protocol='m3u8_frag_urls', to_stdout=info_dict['to_stdout']):
return HlsFD
elif params.get('hls_prefer_native') is True:


@@ -11,6 +11,7 @@ from ..utils import (
encodeFilename,
error_to_compat_str,
format_bytes,
LockingUnsupportedError,
sanitize_open,
shell_quote,
timeconvert,
@@ -159,7 +160,7 @@ class FileDownloader(object):
return int(round(number * multiplier))
def to_screen(self, *args, **kargs):
self.ydl.to_stdout(*args, quiet=self.params.get('quiet'), **kargs)
self.ydl.to_screen(*args, quiet=self.params.get('quiet'), **kargs)
def to_stderr(self, message):
self.ydl.to_stderr(message)
@@ -210,28 +211,44 @@ class FileDownloader(object):
def ytdl_filename(self, filename):
return filename + '.ytdl'
def sanitize_open(self, filename, open_mode):
file_access_retries = self.params.get('file_access_retries', 10)
def wrap_file_access(action, *, fatal=False):
def outer(func):
def inner(self, *args, **kwargs):
file_access_retries = self.params.get('file_access_retries', 0)
retry = 0
while True:
try:
return sanitize_open(filename, open_mode)
return func(self, *args, **kwargs)
except (IOError, OSError) as err:
retry = retry + 1
if retry > file_access_retries or err.errno not in (errno.EACCES,):
if retry > file_access_retries or err.errno not in (errno.EACCES, errno.EINVAL):
if not fatal:
self.report_error(f'unable to {action} file: {err}')
return
raise
self.to_screen(
'[download] Got file access error. Retrying (attempt %d of %s) ...'
% (retry, self.format_retries(file_access_retries)))
f'[download] Unable to {action} file due to file access error. '
f'Retrying (attempt {retry} of {self.format_retries(file_access_retries)}) ...')
time.sleep(0.01)
return inner
return outer
@wrap_file_access('open', fatal=True)
def sanitize_open(self, filename, open_mode):
f, filename = sanitize_open(filename, open_mode)
if not getattr(f, 'locked', None):
self.write_debug(f'{LockingUnsupportedError.msg}. Proceeding without locking', only_once=True)
return f, filename
@wrap_file_access('remove')
def try_remove(self, filename):
os.remove(filename)
@wrap_file_access('rename')
def try_rename(self, old_filename, new_filename):
if old_filename == new_filename:
return
try:
os.replace(old_filename, new_filename)
except (IOError, OSError) as err:
self.report_error(f'unable to rename file: {err}')
def try_utime(self, filename, last_modified_hdr):
"""Try to set the last-modified time of the given file."""
@@ -264,9 +281,9 @@ class FileDownloader(object):
elif self.ydl.params.get('logger'):
self._multiline = MultilineLogger(self.ydl.params['logger'], lines)
elif self.params.get('progress_with_newline'):
self._multiline = BreaklineStatusPrinter(self.ydl._screen_file, lines)
self._multiline = BreaklineStatusPrinter(self.ydl._out_files['screen'], lines)
else:
self._multiline = MultilinePrinter(self.ydl._screen_file, lines, not self.params.get('quiet'))
self._multiline = MultilinePrinter(self.ydl._out_files['screen'], lines, not self.params.get('quiet'))
self._multiline.allow_colors = self._multiline._HAVE_FULLCAP and not self.params.get('no_color')
def _finish_multiline_status(self):
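
The wrap_file_access factory above turns transient file-access failures into bounded retries. A self-contained sketch of the same decorator pattern (hypothetical names, not the FileDownloader API):

import errno
import functools
import os
import time

def retrying(action, retries=3, delay=0.01):
    # Retry EACCES/EINVAL a few times before giving up, as wrap_file_access does above
    def outer(func):
        @functools.wraps(func)
        def inner(*args, **kwargs):
            for attempt in range(retries + 1):
                try:
                    return func(*args, **kwargs)
                except OSError as err:
                    if attempt >= retries or err.errno not in (errno.EACCES, errno.EINVAL):
                        raise
                    print(f'Unable to {action} file. Retrying (attempt {attempt + 1} of {retries}) ...')
                    time.sleep(delay)
        return inner
    return outer

@retrying('remove')
def try_remove(path):
    os.remove(path)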


@@ -13,6 +13,7 @@ from ..compat import (
)
from ..postprocessor.ffmpeg import FFmpegPostProcessor, EXT_TO_OUT_FORMATS
from ..utils import (
classproperty,
cli_option,
cli_valueless_option,
cli_bool_option,
@@ -73,17 +74,23 @@ class ExternalFD(FragmentFD):
def get_basename(cls):
return cls.__name__[:-2].lower()
@classproperty
def EXE_NAME(cls):
return cls.get_basename()
@property
def exe(self):
return self.get_basename()
return self.EXE_NAME
@classmethod
def available(cls, path=None):
path = check_executable(path or cls.get_basename(), [cls.AVAILABLE_OPT])
if path:
path = check_executable(
cls.EXE_NAME if path in (None, cls.get_basename()) else path,
[cls.AVAILABLE_OPT])
if not path:
return False
cls.exe = path
return path
return False
@classmethod
def supports(cls, info_dict):
@@ -106,7 +113,7 @@ class ExternalFD(FragmentFD):
def _configuration_args(self, keys=None, *args, **kwargs):
return _configuration_args(
self.get_basename(), self.params.get('external_downloader_args'), self.get_basename(),
self.get_basename(), self.params.get('external_downloader_args'), self.EXE_NAME,
keys, *args, **kwargs)
def _call_downloader(self, tmpfilename, info_dict):
@@ -159,9 +166,9 @@ class ExternalFD(FragmentFD):
dest.write(decrypt_fragment(fragment, src.read()))
src.close()
if not self.params.get('keep_fragments', False):
os.remove(encodeFilename(fragment_filename))
self.try_remove(encodeFilename(fragment_filename))
dest.close()
os.remove(encodeFilename('%s.frag.urls' % tmpfilename))
self.try_remove(encodeFilename('%s.frag.urls' % tmpfilename))
return 0
@@ -169,7 +176,7 @@ class CurlFD(ExternalFD):
AVAILABLE_OPT = '-V'
def _make_cmd(self, tmpfilename, info_dict):
cmd = [self.exe, '--location', '-o', tmpfilename]
cmd = [self.exe, '--location', '-o', tmpfilename, '--compressed']
if info_dict.get('http_headers') is not None:
for key, val in info_dict['http_headers'].items():
cmd += ['--header', '%s: %s' % (key, val)]
@@ -219,7 +226,7 @@ class WgetFD(ExternalFD):
AVAILABLE_OPT = '--version'
def _make_cmd(self, tmpfilename, info_dict):
cmd = [self.exe, '-O', tmpfilename, '-nv', '--no-cookies']
cmd = [self.exe, '-O', tmpfilename, '-nv', '--no-cookies', '--compression=auto']
if info_dict.get('http_headers') is not None:
for key, val in info_dict['http_headers'].items():
cmd += ['--header', '%s: %s' % (key, val)]
@@ -230,7 +237,10 @@ class WgetFD(ExternalFD):
retry[1] = '0'
cmd += retry
cmd += self._option('--bind-address', 'source_address')
cmd += self._option('--proxy', 'proxy')
proxy = self.params.get('proxy')
if proxy:
for var in ('http_proxy', 'https_proxy'):
cmd += ['--execute', '%s=%s' % (var, proxy)]
cmd += self._valueless_option('--no-check-certificate', 'nocheckcertificate')
cmd += self._configuration_args()
cmd += ['--', info_dict['url']]
@@ -253,7 +263,7 @@ class Aria2cFD(ExternalFD):
def _make_cmd(self, tmpfilename, info_dict):
cmd = [self.exe, '-c',
'--console-log-level=warn', '--summary-interval=0', '--download-result=hide',
'--file-allocation=none', '-x16', '-j16', '-s16']
'--http-accept-gzip=true', '--file-allocation=none', '-x16', '-j16', '-s16']
if 'fragments' in info_dict:
cmd += ['--allow-overwrite=true', '--allow-piece-length-change=true']
else:
@@ -303,10 +313,7 @@ class Aria2cFD(ExternalFD):
class HttpieFD(ExternalFD):
AVAILABLE_OPT = '--version'
@classmethod
def available(cls, path=None):
return super().available(path or 'http')
EXE_NAME = 'http'
def _make_cmd(self, tmpfilename, info_dict):
cmd = ['http', '--download', '--output', tmpfilename, info_dict['url']]
@@ -507,11 +514,13 @@ class AVconvFD(FFmpegFD):
pass
_BY_NAME = dict(
(klass.get_basename(), klass)
_BY_NAME = {
klass.get_basename(): klass
for name, klass in globals().items()
if name.endswith('FD') and name not in ('ExternalFD', 'FragmentFD')
)
}
_BY_EXE = {klass.EXE_NAME: klass for klass in _BY_NAME.values()}
def list_external_downloaders():
@@ -523,4 +532,4 @@ def get_external_downloader(external_downloader):
downloader. """
# Drop .exe extension on Windows
bn = os.path.splitext(os.path.basename(external_downloader))[0]
return _BY_NAME.get(bn)
return _BY_NAME.get(bn, _BY_EXE.get(bn))
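
With _BY_EXE added, a user-supplied path is resolved first by class basename and then by executable name. A toy illustration with a hypothetical registry standing in for the downloader classes:

import os.path

_BY_NAME = {'aria2c': 'Aria2cFD', 'httpie': 'HttpieFD'}  # hypothetical stand-ins
_BY_EXE = {'http': 'HttpieFD'}  # EXE_NAME ('http') differs from the class basename ('httpie')

def lookup_downloader(external_downloader):
    # Drop directory and .exe extension, then try both registries
    bn = os.path.splitext(os.path.basename(external_downloader))[0]
    return _BY_NAME.get(bn, _BY_EXE.get(bn))

print(lookup_downloader('/usr/local/bin/aria2c'))  # 'Aria2cFD'
print(lookup_downloader('/usr/bin/http'))          # 'HttpieFD'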

yt_dlp/downloader/fc2.py (new file, 41 lines)

@@ -0,0 +1,41 @@
from __future__ import division, unicode_literals
import threading
from .common import FileDownloader
from .external import FFmpegFD
class FC2LiveFD(FileDownloader):
"""
Downloads FC2 live without being stopped. <br>
Note, this is not a part of public API, and will be removed without notice.
DO NOT USE
"""
def real_download(self, filename, info_dict):
ws = info_dict['ws']
heartbeat_lock = threading.Lock()
heartbeat_state = [None, 1]
def heartbeat():
try:
heartbeat_state[1] += 1
ws.send('{"name":"heartbeat","arguments":{},"id":%d}' % heartbeat_state[1])
except Exception:
self.to_screen('[fc2:live] Heartbeat failed')
with heartbeat_lock:
heartbeat_state[0] = threading.Timer(30, heartbeat)
heartbeat_state[0]._daemonic = True
heartbeat_state[0].start()
heartbeat()
new_info_dict = info_dict.copy()
new_info_dict.update({
'ws': None,
'protocol': 'live_ffmpeg',
})
return FFmpegFD(self.ydl, self.params or {}).download(filename, new_info_dict)
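
The heartbeat above keeps the FC2 websocket alive by re-arming a threading.Timer from inside its own callback. A standalone sketch of that pattern (hypothetical send callable, not the downloader itself):

import threading

def start_heartbeat(send, interval=30):
    lock = threading.Lock()
    state = {'timer': None, 'id': 1}

    def heartbeat():
        try:
            state['id'] += 1
            send('{"name":"heartbeat","arguments":{},"id":%d}' % state['id'])
        except Exception:
            print('Heartbeat failed')
        with lock:
            # re-arm: each tick schedules the next one
            state['timer'] = threading.Timer(interval, heartbeat)
            state['timer'].daemon = True
            state['timer'].start()

    heartbeat()
    return state  # cancel state['timer'] to stop

start_heartbeat(print, interval=1)  # demo: ticks until the process exits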


@@ -25,6 +25,7 @@ from ..utils import (
error_to_compat_str,
encodeFilename,
sanitized_Request,
traverse_obj,
)
@@ -132,14 +133,19 @@ class FragmentFD(FileDownloader):
}
success = ctx['dl'].download(fragment_filename, fragment_info_dict)
if not success:
return False, None
return False
if fragment_info_dict.get('filetime'):
ctx['fragment_filetime'] = fragment_info_dict.get('filetime')
ctx['fragment_filename_sanitized'] = fragment_filename
return True, self._read_fragment(ctx)
return True
def _read_fragment(self, ctx):
try:
down, frag_sanitized = self.sanitize_open(ctx['fragment_filename_sanitized'], 'rb')
except FileNotFoundError:
if ctx.get('live'):
return None
raise
ctx['fragment_filename_sanitized'] = frag_sanitized
frag_content = down.read()
down.close()
@@ -153,7 +159,7 @@ class FragmentFD(FileDownloader):
if self.__do_ytdl_file(ctx):
self._write_ytdl_file(ctx)
if not self.params.get('keep_fragments', False):
os.remove(encodeFilename(ctx['fragment_filename_sanitized']))
self.try_remove(encodeFilename(ctx['fragment_filename_sanitized']))
del ctx['fragment_filename_sanitized']
def _prepare_frag_download(self, ctx):
@@ -172,7 +178,7 @@ class FragmentFD(FileDownloader):
dl = HttpQuietDownloader(
self.ydl,
{
'continuedl': True,
'continuedl': self.params.get('continuedl', True),
'quiet': self.params.get('quiet'),
'noprogress': True,
'ratelimit': self.params.get('ratelimit'),
@@ -299,7 +305,7 @@ class FragmentFD(FileDownloader):
if self.__do_ytdl_file(ctx):
ytdl_filename = encodeFilename(self.ytdl_filename(ctx['filename']))
if os.path.isfile(ytdl_filename):
os.remove(ytdl_filename)
self.try_remove(ytdl_filename)
elapsed = time.time() - ctx['started']
if ctx['tmpfilename'] == '-':
@@ -382,6 +388,7 @@ class FragmentFD(FileDownloader):
max_workers = self.params.get('concurrent_fragment_downloads', 1)
if max_progress > 1:
self._prepare_multiline_status(max_progress)
is_live = any(traverse_obj(args, (..., 2, 'is_live'), default=[]))
def thread_func(idx, ctx, fragments, info_dict, tpe):
ctx['max_progress'] = max_progress
@@ -395,25 +402,43 @@ class FragmentFD(FileDownloader):
def __exit__(self, exc_type, exc_val, exc_tb):
pass
spins = []
if compat_os_name == 'nt':
self.report_warning('Ctrl+C does not work on Windows when used with parallel threads. '
'This is a known issue and patches are welcome')
def future_result(future):
while True:
try:
return future.result(0.1)
except KeyboardInterrupt:
raise
except concurrent.futures.TimeoutError:
continue
else:
def future_result(future):
return future.result()
def interrupt_trigger_iter(fg):
for f in fg:
if not interrupt_trigger[0]:
break
yield f
spins = []
for idx, (ctx, fragments, info_dict) in enumerate(args):
tpe = FTPE(math.ceil(max_workers / max_progress))
job = tpe.submit(thread_func, idx, ctx, fragments, info_dict, tpe)
job = tpe.submit(thread_func, idx, ctx, interrupt_trigger_iter(fragments), info_dict, tpe)
spins.append((tpe, job))
result = True
for tpe, job in spins:
try:
result = result and job.result()
result = result and future_result(job)
except KeyboardInterrupt:
interrupt_trigger[0] = False
finally:
tpe.shutdown(wait=True)
if not interrupt_trigger[0]:
if not interrupt_trigger[0] and not is_live:
raise KeyboardInterrupt()
# we expect the user wants to stop and DOES want the preceding postprocessors to run;
# so return an intermediate result here instead of raising KeyboardInterrupt on live
return result
def download_and_append_fragments(
@@ -431,24 +456,23 @@ class FragmentFD(FileDownloader):
pack_func = lambda frag_content, _: frag_content
def download_fragment(fragment, ctx):
if not interrupt_trigger[0]:
return
frag_index = ctx['fragment_index'] = fragment['frag_index']
ctx['last_error'] = None
if not interrupt_trigger[0]:
return False, frag_index
headers = info_dict.get('http_headers', {}).copy()
byte_range = fragment.get('byte_range')
if byte_range:
headers['Range'] = 'bytes=%d-%d' % (byte_range['start'], byte_range['end'] - 1)
# Never skip the first fragment
fatal = is_fatal(fragment.get('index') or (frag_index - 1))
count, frag_content = 0, None
fatal, count = is_fatal(fragment.get('index') or (frag_index - 1)), 0
while count <= fragment_retries:
try:
success, frag_content = self._download_fragment(ctx, fragment['url'], info_dict, headers)
if not success:
return False, frag_index
if self._download_fragment(ctx, fragment['url'], info_dict, headers):
break
return
except (compat_urllib_error.HTTPError, http.client.IncompleteRead) as err:
# Unavailable (possibly temporary) fragments may be served.
# First we try to retry then either skip or abort.
@@ -465,25 +489,19 @@ class FragmentFD(FileDownloader):
break
raise
if count > fragment_retries:
if not fatal:
return False, frag_index
if count > fragment_retries and fatal:
ctx['dest_stream'].close()
self.report_error('Giving up after %s fragment retries' % fragment_retries)
return False, frag_index
return frag_content, frag_index
def append_fragment(frag_content, frag_index, ctx):
if not frag_content:
if not is_fatal(frag_index - 1):
if frag_content:
self._append_fragment(ctx, pack_func(frag_content, frag_index))
elif not is_fatal(frag_index - 1):
self.report_skip_fragment(frag_index, 'fragment not found')
return True
else:
ctx['dest_stream'].close()
self.report_error(
'fragment %s not found, unable to continue' % frag_index)
self.report_error(f'fragment {frag_index} not found, unable to continue')
return False
self._append_fragment(ctx, pack_func(frag_content, frag_index))
return True
decrypt_fragment = self.decrypter(info_dict)
@@ -494,25 +512,23 @@ class FragmentFD(FileDownloader):
def _download_fragment(fragment):
ctx_copy = ctx.copy()
frag_content, frag_index = download_fragment(fragment, ctx_copy)
return fragment, frag_content, frag_index, ctx_copy.get('fragment_filename_sanitized')
download_fragment(fragment, ctx_copy)
return fragment, fragment['frag_index'], ctx_copy.get('fragment_filename_sanitized')
self.report_warning('The download speed shown is only of one thread. This is a known issue and patches are welcome')
with tpe or concurrent.futures.ThreadPoolExecutor(max_workers) as pool:
for fragment, frag_content, frag_index, frag_filename in pool.map(_download_fragment, fragments):
if not interrupt_trigger[0]:
break
for fragment, frag_index, frag_filename in pool.map(_download_fragment, fragments):
ctx['fragment_filename_sanitized'] = frag_filename
ctx['fragment_index'] = frag_index
result = append_fragment(decrypt_fragment(fragment, frag_content), frag_index, ctx)
result = append_fragment(decrypt_fragment(fragment, self._read_fragment(ctx)), frag_index, ctx)
if not result:
return False
else:
for fragment in fragments:
if not interrupt_trigger[0]:
break
frag_content, frag_index = download_fragment(fragment, ctx)
result = append_fragment(decrypt_fragment(fragment, frag_content), frag_index, ctx)
download_fragment(fragment, ctx)
result = append_fragment(decrypt_fragment(fragment, self._read_fragment(ctx)), fragment['frag_index'], ctx)
if not result:
return False
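
This refactor makes _download_fragment return a plain bool and moves payload access into _read_fragment, so only the per-context filename crosses thread boundaries. A simplified sketch of the two-phase pattern (hypothetical class, local files standing in for HTTP):

import os
import tempfile

class TwoPhaseFetcher:
    def download(self, ctx, url, path):
        # phase 1: write the payload and record only its location in ctx
        with open(path, 'wb') as f:
            f.write(f'payload of {url}'.encode())
        ctx['fragment_filename_sanitized'] = path
        return True

    def read(self, ctx):
        # phase 2: re-open from the recorded location, possibly in another thread
        with open(ctx['fragment_filename_sanitized'], 'rb') as f:
            return f.read()

ctx, fetcher = {}, TwoPhaseFetcher()
fd, path = tempfile.mkstemp(suffix='.part')
os.close(fd)
if fetcher.download(ctx, 'https://example.com/frag1', path):
    print(fetcher.read(ctx))  # b'payload of https://example.com/frag1'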


@@ -1,28 +1,30 @@
from __future__ import unicode_literals
import errno
import os
import socket
import ssl
import time
import random
import re
from .common import FileDownloader
from ..compat import (
compat_str,
compat_urllib_error,
compat_http_client
)
from ..utils import (
ContentTooShortError,
encodeFilename,
int_or_none,
parse_http_range,
sanitized_Request,
ThrottledDownload,
try_call,
write_xattr,
XAttrMetadataError,
XAttrUnavailableError,
)
RESPONSE_READ_EXCEPTIONS = (TimeoutError, ConnectionError, ssl.SSLError, compat_http_client.HTTPException)
class HttpFD(FileDownloader):
def real_download(self, filename, info_dict):
@@ -53,11 +55,11 @@ class HttpFD(FileDownloader):
ctx.open_mode = 'wb'
ctx.resume_len = 0
ctx.data_len = None
ctx.block_size = self.params.get('buffersize', 1024)
ctx.start_time = time.time()
ctx.chunk_size = None
throttle_start = None
# parse given Range
req_start, req_end, _ = parse_http_range(headers.get('Range'))
if self.params.get('continuedl', True):
# Establish possible resume length
@@ -80,43 +82,50 @@ class HttpFD(FileDownloader):
class NextFragment(Exception):
pass
def set_range(req, start, end):
range_header = 'bytes=%d-' % start
if end:
range_header += compat_str(end)
req.add_header('Range', range_header)
def establish_connection():
ctx.chunk_size = (random.randint(int(chunk_size * 0.95), chunk_size)
if not is_test and chunk_size else chunk_size)
if ctx.resume_len > 0:
range_start = ctx.resume_len
if req_start is not None:
# offset the beginning of Range to be within request
range_start += req_start
if ctx.is_resume:
self.report_resuming_byte(ctx.resume_len)
ctx.open_mode = 'ab'
elif req_start is not None:
range_start = req_start
elif ctx.chunk_size > 0:
range_start = 0
else:
range_start = None
ctx.is_resume = False
range_end = range_start + ctx.chunk_size - 1 if ctx.chunk_size else None
if range_end and ctx.data_len is not None and range_end >= ctx.data_len:
range_end = ctx.data_len - 1
has_range = range_start is not None
ctx.has_range = has_range
if ctx.chunk_size:
chunk_aware_end = range_start + ctx.chunk_size - 1
# we're not allowed to download outside Range
range_end = chunk_aware_end if req_end is None else min(chunk_aware_end, req_end)
elif req_end is not None:
# there's no need for chunked downloads, so download until the end of Range
range_end = req_end
else:
range_end = None
if try_call(lambda: range_start > range_end):
ctx.resume_len = 0
ctx.open_mode = 'wb'
raise RetryDownload(Exception(f'Conflicting range. (start={range_start} > end={range_end})'))
if try_call(lambda: range_end >= ctx.content_len):
range_end = ctx.content_len - 1
request = sanitized_Request(url, request_data, headers)
has_range = range_start is not None
if has_range:
set_range(request, range_start, range_end)
request.add_header('Range', f'bytes={int(range_start)}-{int_or_none(range_end) or ""}')
# Establish connection
try:
try:
ctx.data = self.ydl.urlopen(request)
except (compat_urllib_error.URLError, ) as err:
# reason may not be available, e.g. for urllib2.HTTPError on python 2.6
reason = getattr(err, 'reason', None)
if isinstance(reason, socket.timeout):
raise RetryDownload(err)
raise err
# When trying to resume, Content-Range HTTP header of response has to be checked
# to match the value of requested Range HTTP header. This is due to a webservers
# that don't support resuming and serve a whole file with no Content-Range
@@ -124,13 +133,9 @@ class HttpFD(FileDownloader):
# https://github.com/ytdl-org/youtube-dl/issues/6057#issuecomment-126129799)
if has_range:
content_range = ctx.data.headers.get('Content-Range')
if content_range:
content_range_m = re.search(r'bytes (\d+)-(\d+)?(?:/(\d+))?', content_range)
content_range_start, content_range_end, content_len = parse_http_range(content_range)
if content_range_start is not None and range_start == content_range_start:
# Content-Range is present and matches requested Range, resume is possible
if content_range_m:
if range_start == int(content_range_m.group(1)):
content_range_end = int_or_none(content_range_m.group(2))
content_len = int_or_none(content_range_m.group(3))
accept_content_len = (
# Non-chunked download
not ctx.chunk_size
@@ -139,7 +144,9 @@ class HttpFD(FileDownloader):
or content_range_end == range_end
or content_len < range_end)
if accept_content_len:
ctx.data_len = content_len
ctx.content_len = content_len
if content_len or req_end:
ctx.data_len = min(content_len or req_end, req_end or content_len) - (req_start or 0)
return
# Content-Range is either not present or invalid. Assuming remote webserver is
# trying to send the whole file, resume is not possible, so wiping the local file
@@ -147,8 +154,7 @@ class HttpFD(FileDownloader):
self.report_unable_to_resume()
ctx.resume_len = 0
ctx.open_mode = 'wb'
ctx.data_len = int_or_none(ctx.data.info().get('Content-length', None))
return
ctx.data_len = ctx.content_len = int_or_none(ctx.data.info().get('Content-length', None))
except (compat_urllib_error.HTTPError, ) as err:
if err.code == 416:
# Unable to resume (requested range not satisfiable)
@@ -190,16 +196,16 @@ class HttpFD(FileDownloader):
# Unexpected HTTP error
raise
raise RetryDownload(err)
except socket.timeout as err:
raise RetryDownload(err)
except socket.error as err:
if err.errno in (errno.ECONNRESET, errno.ETIMEDOUT):
# Connection reset is no problem, just retry
raise RetryDownload(err)
except compat_urllib_error.URLError as err:
if isinstance(err.reason, ssl.CertificateError):
raise
raise RetryDownload(err)
# In urllib.request.AbstractHTTPHandler, the response is partially read on request.
# Any errors that occur during this will not be wrapped by URLError
except RESPONSE_READ_EXCEPTIONS as err:
raise RetryDownload(err)
def download():
nonlocal throttle_start
data_len = ctx.data.info().get('Content-length', None)
# Range HTTP header may be ignored/unsupported by a webserver
@@ -242,16 +248,8 @@ class HttpFD(FileDownloader):
try:
# Download and write
data_block = ctx.data.read(block_size if not is_test else min(block_size, data_len - byte_counter))
# socket.timeout is a subclass of socket.error but may not have
# errno set
except socket.timeout as e:
retry(e)
except socket.error as e:
# SSLError on python 2 (inherits socket.error) may have
# no errno set but this error message
if e.errno in (errno.ECONNRESET, errno.ETIMEDOUT) or getattr(e, 'message', None) == 'The read operation timed out':
retry(e)
raise
except RESPONSE_READ_EXCEPTIONS as err:
retry(err)
byte_counter += len(data_block)
@@ -322,16 +320,16 @@ class HttpFD(FileDownloader):
if speed and speed < (self.params.get('throttledratelimit') or 0):
# The speed must stay below the limit for 3 seconds
# This prevents raising error when the speed temporarily goes down
if throttle_start is None:
throttle_start = now
elif now - throttle_start > 3:
if ctx.throttle_start is None:
ctx.throttle_start = now
elif now - ctx.throttle_start > 3:
if ctx.stream is not None and ctx.tmpfilename != '-':
ctx.stream.close()
raise ThrottledDownload()
elif speed:
throttle_start = None
ctx.throttle_start = None
if not is_test and ctx.chunk_size and ctx.data_len is not None and byte_counter < ctx.data_len:
if not is_test and ctx.chunk_size and ctx.content_len is not None and byte_counter < ctx.content_len:
ctx.resume_len = byte_counter
# ctx.block_size = block_size
raise NextFragment()
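
The resume check above now goes through parse_http_range instead of an ad-hoc regex. A hedged sketch of what such a parser does (the real one lives in yt_dlp.utils and may differ in detail):

import re

def parse_range_like(value):
    # Accepts both 'Range: bytes=0-1023' and 'Content-Range: bytes 512-1023/4096' shapes;
    # missing parts come back as None
    m = re.search(r'bytes[ =](\d+)-(\d+)?(?:/(\d+))?', value or '')
    if not m:
        return None, None, None
    return tuple(int(g) if g is not None else None for g in m.groups())

print(parse_range_like('bytes=512-'))           # (512, None, None)
print(parse_range_like('bytes 512-1023/4096'))  # (512, 1023, 4096)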


@@ -263,9 +263,11 @@ class IsmFD(FragmentFD):
count = 0
while count <= fragment_retries:
try:
success, frag_content = self._download_fragment(ctx, segment['url'], info_dict)
success = self._download_fragment(ctx, segment['url'], info_dict)
if not success:
return False
frag_content = self._read_fragment(ctx)
if not extra_state['ism_track_written']:
tfhd_data = extract_box_data(frag_content, [b'moof', b'traf', b'tfhd'])
info_dict['_download_params']['track_id'] = u32.unpack(tfhd_data[4:8])[0]


@@ -166,10 +166,15 @@ body > figure > img {
if (i + 1) <= ctx['fragment_index']:
continue
fragment_url = fragment.get('url')
if not fragment_url:
assert fragment_base_url
fragment_url = urljoin(fragment_base_url, fragment['path'])
success, frag_content = self._download_fragment(ctx, fragment_url, info_dict)
success = self._download_fragment(ctx, fragment_url, info_dict)
if not success:
continue
frag_content = self._read_fragment(ctx)
mime_type = b'image/jpeg'
if frag_content.startswith(b'\x89PNG\r\n\x1a\n'):


@@ -22,6 +22,9 @@ class YoutubeLiveChatFD(FragmentFD):
def real_download(self, filename, info_dict):
video_id = info_dict['video_id']
self.to_screen('[%s] Downloading live chat' % self.FD_NAME)
if not self.params.get('skip_download') and info_dict['protocol'] == 'youtube_live_chat':
self.report_warning('Live chat download runs until the livestream ends. '
'If you wish to download the video simultaneously, run a separate yt-dlp instance')
fragment_retries = self.params.get('fragment_retries', 0)
test = self.params.get('test', False)
@@ -112,9 +115,10 @@ class YoutubeLiveChatFD(FragmentFD):
count = 0
while count <= fragment_retries:
try:
success, raw_fragment = dl_fragment(url, request_data, headers)
success = dl_fragment(url, request_data, headers)
if not success:
return False, None, None, None
raw_fragment = self._read_fragment(ctx)
try:
data = ie.extract_yt_initial_data(video_id, raw_fragment.decode('utf-8', 'replace'))
except RegexNotFoundError:
@@ -142,9 +146,10 @@ class YoutubeLiveChatFD(FragmentFD):
self._prepare_and_start_frag_download(ctx, info_dict)
success, raw_fragment = dl_fragment(info_dict['url'])
success = dl_fragment(info_dict['url'])
if not success:
return False
raw_fragment = self._read_fragment(ctx)
try:
data = ie.extract_yt_initial_data(video_id, raw_fragment.decode('utf-8', 'replace'))
except RegexNotFoundError:


@@ -213,7 +213,7 @@ class ABCIViewIE(InfoExtractor):
'hdnea': token,
})
for sd in ('720', 'sd', 'sd-low'):
for sd in ('1080', '720', 'sd', 'sd-low'):
sd_url = try_get(
stream, lambda x: x['streams']['hls'][sd], compat_str)
if not sd_url:

yt_dlp/extractor/abematv.py (new file, 476 lines)

@@ -0,0 +1,476 @@
import io
import json
import time
import hashlib
import hmac
import re
import struct
from base64 import urlsafe_b64encode
from binascii import unhexlify
from .common import InfoExtractor
from ..aes import aes_ecb_decrypt
from ..compat import (
compat_urllib_response,
compat_urllib_parse_urlparse,
compat_urllib_request,
)
from ..utils import (
ExtractorError,
decode_base,
int_or_none,
random_uuidv4,
request_to_url,
time_seconds,
update_url_query,
traverse_obj,
intlist_to_bytes,
bytes_to_intlist,
urljoin,
)
# NOTE: network handler related code is temporary thing until network stack overhaul PRs are merged (#2861/#2862)
def add_opener(ydl, handler):
''' Add a handler for opening URLs, like _download_webpage '''
# https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L426
# https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L605
assert isinstance(ydl._opener, compat_urllib_request.OpenerDirector)
ydl._opener.add_handler(handler)
def remove_opener(ydl, handler):
'''
Remove handler(s) for opening URLs
@param handler Either handler object itself or handler type.
Specifying handler type will remove all handler which isinstance returns True.
'''
# https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L426
# https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L605
opener = ydl._opener
assert isinstance(ydl._opener, compat_urllib_request.OpenerDirector)
if isinstance(handler, (type, tuple)):
find_cp = lambda x: isinstance(x, handler)
else:
find_cp = lambda x: x is handler
removed = []
for meth in dir(handler):
if meth in ["redirect_request", "do_open", "proxy_open"]:
# oops, coincidental match
continue
i = meth.find("_")
protocol = meth[:i]
condition = meth[i + 1:]
if condition.startswith("error"):
j = condition.find("_") + i + 1
kind = meth[j + 1:]
try:
kind = int(kind)
except ValueError:
pass
lookup = opener.handle_error.get(protocol, {})
opener.handle_error[protocol] = lookup
elif condition == "open":
kind = protocol
lookup = opener.handle_open
elif condition == "response":
kind = protocol
lookup = opener.process_response
elif condition == "request":
kind = protocol
lookup = opener.process_request
else:
continue
handlers = lookup.setdefault(kind, [])
if handlers:
# collect the matching handlers before filtering them out, so `if removed:` below works
removed.extend(x for x in handlers if find_cp(x))
handlers[:] = [x for x in handlers if not find_cp(x)]
if removed:
for x in opener.handlers:
if find_cp(x):
x.add_parent(None)
opener.handlers[:] = [x for x in opener.handlers if not find_cp(x)]
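
add_opener and remove_opener above splice handlers in and out of YoutubeDL's OpenerDirector. The mechanism is plain urllib: any BaseHandler exposing a <scheme>_open method serves that scheme. A minimal standalone illustration with a hypothetical 'demo' scheme, using a fresh opener rather than the YoutubeDL one:

import io
import urllib.request
import urllib.response

class DemoHandler(urllib.request.BaseHandler):
    # Serves demo://<payload> by echoing the payload back as the response body
    def demo_open(self, req):
        payload = req.full_url[len('demo://'):].encode('utf-8')
        return urllib.response.addinfourl(
            io.BytesIO(payload), headers={'Content-Length': len(payload)},
            url=req.full_url, code=200)

opener = urllib.request.build_opener(DemoHandler())
print(opener.open('demo://hello').read())  # b'hello'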
class AbemaLicenseHandler(compat_urllib_request.BaseHandler):
handler_order = 499
STRTABLE = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'
HKEY = b'3AF0298C219469522A313570E8583005A642E73EDD58E3EA2FB7339D3DF1597E'
def __init__(self, ie: 'AbemaTVIE'):
# the protocol that this should really handle is 'abematv-license://'
# abematv_license_open is just a placeholder for development purposes
# ref. https://github.com/python/cpython/blob/f4c03484da59049eb62a9bf7777b963e2267d187/Lib/urllib/request.py#L510
setattr(self, 'abematv-license_open', getattr(self, 'abematv_license_open'))
self.ie = ie
def _get_videokey_from_ticket(self, ticket):
to_show = self.ie._downloader.params.get('verbose', False)
media_token = self.ie._get_media_token(to_show=to_show)
license_response = self.ie._download_json(
'https://license.abema.io/abematv-hls', None, note='Requesting playback license' if to_show else False,
query={'t': media_token},
data=json.dumps({
'kv': 'a',
'lt': ticket
}).encode('utf-8'),
headers={
'Content-Type': 'application/json',
})
res = decode_base(license_response['k'], self.STRTABLE)
encvideokey = bytes_to_intlist(struct.pack('>QQ', res >> 64, res & 0xffffffffffffffff))
h = hmac.new(
unhexlify(self.HKEY),
(license_response['cid'] + self.ie._DEVICE_ID).encode('utf-8'),
digestmod=hashlib.sha256)
enckey = bytes_to_intlist(h.digest())
return intlist_to_bytes(aes_ecb_decrypt(encvideokey, enckey))
def abematv_license_open(self, url):
url = request_to_url(url)
ticket = compat_urllib_parse_urlparse(url).netloc
response_data = self._get_videokey_from_ticket(ticket)
return compat_urllib_response.addinfourl(io.BytesIO(response_data), headers={
'Content-Length': len(response_data),
}, url=url, code=200)
class AbemaTVBaseIE(InfoExtractor):
def _extract_breadcrumb_list(self, webpage, video_id):
for jld in re.finditer(
r'(?is)</span></li></ul><script[^>]+type=(["\']?)application/ld\+json\1[^>]*>(?P<json_ld>.+?)</script>',
webpage):
jsonld = self._parse_json(jld.group('json_ld'), video_id, fatal=False)
if jsonld:
if jsonld.get('@type') != 'BreadcrumbList':
continue
trav = traverse_obj(jsonld, ('itemListElement', ..., 'name'))
if trav:
return trav
return []
class AbemaTVIE(AbemaTVBaseIE):
_VALID_URL = r'https?://abema\.tv/(?P<type>now-on-air|video/episode|channels/.+?/slots)/(?P<id>[^?/]+)'
_NETRC_MACHINE = 'abematv'
_TESTS = [{
'url': 'https://abema.tv/video/episode/194-25_s2_p1',
'info_dict': {
'id': '194-25_s2_p1',
'title': '第1話 「チーズケーキ」 「モーニング再び」',
'series': '異世界食堂2',
'series_number': 2,
'episode': '第1話 「チーズケーキ」 「モーニング再び」',
'episode_number': 1,
},
'skip': 'expired',
}, {
'url': 'https://abema.tv/channels/anime-live2/slots/E8tvAnMJ7a9a5d',
'info_dict': {
'id': 'E8tvAnMJ7a9a5d',
'title': 'ゆるキャン△ SEASON 全話一挙【無料ビデオ72時間】',
'series': 'ゆるキャン△ SEASON',
'episode': 'ゆるキャン△ SEASON 全話一挙【無料ビデオ72時間】',
'series_number': 2,
'episode_number': 1,
'description': 'md5:9c5a3172ae763278f9303922f0ea5b17',
},
'skip': 'expired',
}, {
'url': 'https://abema.tv/video/episode/87-877_s1282_p31047',
'info_dict': {
'id': 'E8tvAnMJ7a9a5d',
'title': '第5話『光射す』',
'description': 'md5:56d4fc1b4f7769ded5f923c55bb4695d',
'thumbnail': r're:https://hayabusa\.io/.+',
'series': '相棒',
'episode': '第5話『光射す』',
},
'skip': 'expired',
}, {
'url': 'https://abema.tv/now-on-air/abema-anime',
'info_dict': {
'id': 'abema-anime',
# this varies
# 'title': '女子高生の無駄づかい 全話一挙【無料ビデオ72時間】',
'description': 'md5:55f2e61f46a17e9230802d7bcc913d5f',
'is_live': True,
},
'skip': 'Not supported until yt-dlp implements native live downloader OR AbemaTV can start a local HTTP server',
}]
_USERTOKEN = None
_DEVICE_ID = None
_TIMETABLE = None
_MEDIATOKEN = None
_SECRETKEY = b'v+Gjs=25Aw5erR!J8ZuvRrCx*rGswhB&qdHd_SYerEWdU&a?3DzN9BRbp5KwY4hEmcj5#fykMjJ=AuWz5GSMY-d@H7DMEh3M@9n2G552Us$$k9cD=3TxwWe86!x#Zyhe'
def _generate_aks(self, deviceid):
deviceid = deviceid.encode('utf-8')
# add 1 hour and then drop minute and secs
ts_1hour = int((time_seconds(hours=9) // 3600 + 1) * 3600)
time_struct = time.gmtime(ts_1hour)
ts_1hour_str = str(ts_1hour).encode('utf-8')
tmp = None
def mix_once(nonce):
nonlocal tmp
h = hmac.new(self._SECRETKEY, digestmod=hashlib.sha256)
h.update(nonce)
tmp = h.digest()
def mix_tmp(count):
nonlocal tmp
for i in range(count):
mix_once(tmp)
def mix_twist(nonce):
nonlocal tmp
mix_once(urlsafe_b64encode(tmp).rstrip(b'=') + nonce)
mix_once(self._SECRETKEY)
mix_tmp(time_struct.tm_mon)
mix_twist(deviceid)
mix_tmp(time_struct.tm_mday % 5)
mix_twist(ts_1hour_str)
mix_tmp(time_struct.tm_hour % 5)
return urlsafe_b64encode(tmp).rstrip(b'=').decode('utf-8')
def _get_device_token(self):
if self._USERTOKEN:
return self._USERTOKEN
self._DEVICE_ID = random_uuidv4()
aks = self._generate_aks(self._DEVICE_ID)
user_data = self._download_json(
'https://api.abema.io/v1/users', None, note='Authorizing',
data=json.dumps({
'deviceId': self._DEVICE_ID,
'applicationKeySecret': aks,
}).encode('utf-8'),
headers={
'Content-Type': 'application/json',
})
self._USERTOKEN = user_data['token']
# don't add it more than once, even though it's guarded
remove_opener(self._downloader, AbemaLicenseHandler)
add_opener(self._downloader, AbemaLicenseHandler(self))
return self._USERTOKEN
def _get_media_token(self, invalidate=False, to_show=True):
if not invalidate and self._MEDIATOKEN:
return self._MEDIATOKEN
self._MEDIATOKEN = self._download_json(
'https://api.abema.io/v1/media/token', None, note='Fetching media token' if to_show else False,
query={
'osName': 'android',
'osVersion': '6.0.1',
'osLang': 'ja_JP',
'osTimezone': 'Asia/Tokyo',
'appId': 'tv.abema',
'appVersion': '3.27.1'
}, headers={
'Authorization': 'bearer ' + self._get_device_token()
})['token']
return self._MEDIATOKEN
def _perform_login(self, username, password):
if '@' in username: # don't strictly check if it's email address or not
ep, method = 'user/email', 'email'
else:
ep, method = 'oneTimePassword', 'userId'
login_response = self._download_json(
f'https://api.abema.io/v1/auth/{ep}', None, note='Logging in',
data=json.dumps({
method: username,
'password': password
}).encode('utf-8'), headers={
'Authorization': 'bearer ' + self._get_device_token(),
'Origin': 'https://abema.tv',
'Referer': 'https://abema.tv/',
'Content-Type': 'application/json',
})
self._USERTOKEN = login_response['token']
self._get_media_token(True)
def _real_extract(self, url):
# starting download using infojson from this extractor is undefined behavior,
# and will never be fixed in the future; you must trigger downloads by directly specifying the URL.
# (unless there's a way to hook before downloading by extractor)
video_id, video_type = self._match_valid_url(url).group('id', 'type')
headers = {
'Authorization': 'Bearer ' + self._get_device_token(),
}
video_type = video_type.split('/')[-1]
webpage = self._download_webpage(url, video_id)
canonical_url = self._search_regex(
r'<link\s+rel="canonical"\s*href="(.+?)"', webpage, 'canonical URL',
default=url)
info = self._search_json_ld(webpage, video_id, default={})
title = self._search_regex(
r'<span\s*class=".+?EpisodeTitleBlock__title">(.+?)</span>', webpage, 'title', default=None)
if not title:
jsonld = None
for jld in re.finditer(
r'(?is)<span\s*class="com-m-Thumbnail__image">(?:</span>)?<script[^>]+type=(["\']?)application/ld\+json\1[^>]*>(?P<json_ld>.+?)</script>',
webpage):
jsonld = self._parse_json(jld.group('json_ld'), video_id, fatal=False)
if jsonld:
break
if jsonld:
title = jsonld.get('caption')
if not title and video_type == 'now-on-air':
if not self._TIMETABLE:
# cache the timetable because it goes to 5MiB in size (!!)
self._TIMETABLE = self._download_json(
'https://api.abema.io/v1/timetable/dataSet?debug=false', video_id,
headers=headers)
now = time_seconds(hours=9)
for slot in self._TIMETABLE.get('slots', []):
if slot.get('channelId') != video_id:
continue
if slot['startAt'] <= now and now < slot['endAt']:
title = slot['title']
break
# read breadcrumb on top of page
breadcrumb = self._extract_breadcrumb_list(webpage, video_id)
if breadcrumb:
# breadcrumb list translates to: (example is 1st test for this IE)
# Home > Anime (genre) > Isekai Shokudo 2 (series name) > Episode 1 "Cheese cakes" "Morning again" (episode title)
# hence this works
info['series'] = breadcrumb[-2]
info['episode'] = breadcrumb[-1]
if not title:
title = info['episode']
description = self._html_search_regex(
(r'<p\s+class="com-video-EpisodeDetailsBlock__content"><span\s+class=".+?">(.+?)</span></p><div',
r'<span\s+class=".+?SlotSummary.+?">(.+?)</span></div><div',),
webpage, 'description', default=None, group=1)
if not description:
og_desc = self._html_search_meta(
('description', 'og:description', 'twitter:description'), webpage)
if og_desc:
description = re.sub(r'''(?sx)
^(.+?)(?:
アニメの動画を無料で見るならABEMA| # anime
.+ # applies for most of categories
)?
''', r'\1', og_desc)
# canonical URL may contain series and episode number
mobj = re.search(r's(\d+)_p(\d+)$', canonical_url)
if mobj:
seri = int_or_none(mobj.group(1), default=float('inf'))
epis = int_or_none(mobj.group(2), default=float('inf'))
info['series_number'] = seri if seri < 100 else None
# some anime like Detective Conan (though not available in AbemaTV)
# has more than 1000 episodes (1026 as of 2021/11/15)
info['episode_number'] = epis if epis < 2000 else None
is_live, m3u8_url = False, None
if video_type == 'now-on-air':
is_live = True
channel_url = 'https://api.abema.io/v1/channels'
if video_id == 'news-global':
channel_url = update_url_query(channel_url, {'division': '1'})
onair_channels = self._download_json(channel_url, video_id)
for ch in onair_channels['channels']:
if video_id == ch['id']:
m3u8_url = ch['playback']['hls']
break
else:
raise ExtractorError(f'Cannot find on-air {video_id} channel.', expected=True)
elif video_type == 'episode':
api_response = self._download_json(
f'https://api.abema.io/v1/video/programs/{video_id}', video_id,
note='Checking playability',
headers=headers)
ondemand_types = traverse_obj(api_response, ('terms', ..., 'onDemandType'), default=[])
if 3 not in ondemand_types:
# cannot acquire decryption key for these streams
self.report_warning('This is a premium-only stream')
m3u8_url = f'https://vod-abematv.akamaized.net/program/{video_id}/playlist.m3u8'
elif video_type == 'slots':
api_response = self._download_json(
f'https://api.abema.io/v1/media/slots/{video_id}', video_id,
note='Checking playability',
headers=headers)
if not traverse_obj(api_response, ('slot', 'flags', 'timeshiftFree'), default=False):
self.report_warning('This is a premium-only stream')
m3u8_url = f'https://vod-abematv.akamaized.net/slot/{video_id}/playlist.m3u8'
else:
raise ExtractorError('Unreachable')
if is_live:
self.report_warning('This is a livestream; yt-dlp cannot download it natively, and FFmpeg cannot handle m3u8 manifests from AbemaTV')
self.report_warning('Please consider using Streamlink to download these streams (https://github.com/streamlink/streamlink)')
formats = self._extract_m3u8_formats(
m3u8_url, video_id, ext='mp4', live=is_live)
info.update({
'id': video_id,
'title': title,
'description': description,
'formats': formats,
'is_live': is_live,
})
return info
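The now-on-air branch above is a linear scan of the cached timetable for the slot whose interval covers the current time. A minimal standalone sketch of that lookup, assuming startAt/endAt are Unix-epoch seconds (which the comparison above implies):

import time

def find_current_slot(timetable, channel_id, now=None):
    # assumes 'startAt'/'endAt' are Unix-epoch seconds
    now = now or time.time()
    for slot in timetable.get('slots', []):
        if slot.get('channelId') != channel_id:
            continue
        if slot['startAt'] <= now < slot['endAt']:
            return slot

# hypothetical timetable excerpt
timetable = {'slots': [
    {'channelId': 'abema-anime', 'startAt': 0, 'endAt': 2 ** 62, 'title': 'Now playing'},
]}
print(find_current_slot(timetable, 'abema-anime')['title'])  # Now playing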
class AbemaTVTitleIE(AbemaTVBaseIE):
_VALID_URL = r'https?://abema\.tv/video/title/(?P<id>[^?/]+)'
_TESTS = [{
'url': 'https://abema.tv/video/title/90-1597',
'info_dict': {
'id': '90-1597',
'title': 'シャッフルアイランド',
},
'playlist_mincount': 2,
}, {
'url': 'https://abema.tv/video/title/193-132',
'info_dict': {
'id': '193-132',
'title': '真心が届く~僕とスターのオフィス・ラブ!?~',
},
'playlist_mincount': 16,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
playlist_title, breadcrumb = None, self._extract_breadcrumb_list(webpage, video_id)
if breadcrumb:
playlist_title = breadcrumb[-1]
playlist = [
self.url_result(urljoin('https://abema.tv/', mobj.group(1)))
for mobj in re.finditer(r'<li\s*class=".+?EpisodeList.+?"><a\s*href="(/[^"]+?)"', webpage)]
return self.playlist_result(playlist, playlist_title=playlist_title, playlist_id=video_id)

View File

@@ -126,10 +126,7 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
}])
return subtitles
def _real_initialize(self):
username, password = self._get_login_info()
if not username:
return
def _perform_login(self, username, password):
try:
access_token = (self._download_json(
self._API_BASE_URL + 'authentication/login', None,

View File

@@ -14,7 +14,7 @@ class AdobeConnectIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._html_search_regex(r'<title>(.+?)</title>', webpage, 'title')
title = self._html_extract_title(webpage)
qs = compat_parse_qs(self._search_regex(r"swfUrl\s*=\s*'([^']+)'", webpage, 'swf url').split('?')[1])
is_live = qs.get('isLive', ['false'])[0] == 'true'
formats = []

View File

@@ -1345,6 +1345,11 @@ MSO_INFO = {
'username_field': 'username',
'password_field': 'password',
},
'Suddenlink': {
'name': 'Suddenlink',
'username_field': 'username',
'password_field': 'password',
},
}
@@ -1635,6 +1640,58 @@ class AdobePassIE(InfoExtractor):
urlh.geturl(), video_id, 'Sending final bookend',
query=hidden_data)
post_form(mvpd_confirm_page_res, 'Confirming Login')
elif mso_id == 'Suddenlink':
# Suddenlink is similar to SlingTV in using a tab history count and a meta refresh,
# but it also does a dynamic redirect using JavaScript that has to be followed as well
first_bookend_page, urlh = post_form(
provider_redirect_page_res, 'Pressing Continue...')
hidden_data = self._hidden_inputs(first_bookend_page)
hidden_data['history_val'] = 1
provider_login_redirect_page_res = self._download_webpage_handle(
urlh.geturl(), video_id, 'Sending First Bookend',
query=hidden_data)
provider_login_redirect_page, urlh = provider_login_redirect_page_res
# Some partner websites seem not to have the extra ajaxurl redirect step, so check
# whether the login prompt is already present
if 'id="password" type="password" name="password"' in provider_login_redirect_page:
provider_login_page_res = provider_login_redirect_page_res
else:
provider_tryauth_url = self._html_search_regex(
r'url:\s*[\'"]([^\'"]+)', provider_login_redirect_page, 'ajaxurl')
provider_tryauth_page = self._download_webpage(
provider_tryauth_url, video_id, 'Submitting TryAuth',
query=hidden_data)
provider_login_page_res = self._download_webpage_handle(
f'https://authorize.suddenlink.net/saml/module.php/authSynacor/login.php?AuthState={provider_tryauth_page}',
video_id, 'Getting Login Page',
query=hidden_data)
provider_association_redirect, urlh = post_form(
provider_login_page_res, 'Logging in', {
mso_info['username_field']: username,
mso_info['password_field']: password
})
provider_refresh_redirect_url = extract_redirect_url(
provider_association_redirect, url=urlh.geturl())
last_bookend_page, urlh = self._download_webpage_handle(
provider_refresh_redirect_url, video_id,
'Downloading Auth Association Redirect Page')
hidden_data = self._hidden_inputs(last_bookend_page)
hidden_data['history_val'] = 3
mvpd_confirm_page_res = self._download_webpage_handle(
urlh.geturl(), video_id, 'Sending Final Bookend',
query=hidden_data)
post_form(mvpd_confirm_page_res, 'Confirming Login')
else:
# Some providers (e.g. DIRECTV NOW) have another meta refresh

View File

@@ -1,14 +1,16 @@
# coding: utf-8
from __future__ import unicode_literals
import functools
import re
from .common import InfoExtractor
from ..compat import compat_xpath
from ..utils import (
ExtractorError,
OnDemandPagedList,
date_from_str,
determine_ext,
ExtractorError,
int_or_none,
qualities,
traverse_obj,
@@ -32,7 +34,7 @@ class AfreecaTVIE(InfoExtractor):
/app/(?:index|read_ucc_bbs)\.cgi|
/player/[Pp]layer\.(?:swf|html)
)\?.*?\bnTitleNo=|
vod\.afreecatv\.com/PLAYER/STATION/
vod\.afreecatv\.com/(PLAYER/STATION|player)/
)
(?P<id>\d+)
'''
@@ -170,6 +172,9 @@ class AfreecaTVIE(InfoExtractor):
}, {
'url': 'http://vod.afreecatv.com/PLAYER/STATION/15055030',
'only_matching': True,
}, {
'url': 'http://vod.afreecatv.com/player/15055030',
'only_matching': True,
}]
@staticmethod
@@ -181,14 +186,7 @@ class AfreecaTVIE(InfoExtractor):
video_key['part'] = int(m.group('part'))
return video_key
def _real_initialize(self):
self._login()
def _login(self):
username, password = self._get_login_info()
if username is None:
return
def _perform_login(self, username, password):
login_form = {
'szWork': 'login',
'szType': 'json',
@@ -416,26 +414,35 @@ class AfreecaTVLiveIE(AfreecaTVIE):
def _real_extract(self, url):
broadcaster_id, broadcast_no = self._match_valid_url(url).group('id', 'bno')
password = self.get_param('videopassword')
info = self._download_json(self._LIVE_API_URL, broadcaster_id, fatal=False,
data=urlencode_postdata({'bid': broadcaster_id})) or {}
channel_info = info.get('CHANNEL') or {}
broadcaster_id = channel_info.get('BJID') or broadcaster_id
broadcast_no = channel_info.get('BNO') or broadcast_no
password_protected = channel_info.get('BPWD')
if not broadcast_no:
raise ExtractorError(f'Unable to extract broadcast number ({broadcaster_id} may not be live)', expected=True)
if password_protected == 'Y' and password is None:
raise ExtractorError(
'This livestream is protected by a password, use the --video-password option',
expected=True)
formats = []
quality_key = qualities(self._QUALITIES)
for quality_str in self._QUALITIES:
aid_response = self._download_json(
self._LIVE_API_URL, broadcast_no, fatal=False,
data=urlencode_postdata({
params = {
'bno': broadcast_no,
'stream_type': 'common',
'type': 'aid',
'quality': quality_str,
}),
}
if password is not None:
params['pwd'] = password
aid_response = self._download_json(
self._LIVE_API_URL, broadcast_no, fatal=False,
data=urlencode_postdata(params),
note=f'Downloading access token for {quality_str} stream',
errnote=f'Unable to download access token for {quality_str} stream')
aid = traverse_obj(aid_response, ('CHANNEL', 'AID'))
@@ -477,3 +484,57 @@ class AfreecaTVLiveIE(AfreecaTVIE):
'formats': formats,
'is_live': True,
}
class AfreecaTVUserIE(InfoExtractor):
IE_NAME = 'afreecatv:user'
_VALID_URL = r'https?://bj\.afreeca(?:tv)?\.com/(?P<id>[^/]+)/vods/?(?P<slug_type>[^/]+)?'
_TESTS = [{
'url': 'https://bj.afreecatv.com/ryuryu24/vods/review',
'info_dict': {
'_type': 'playlist',
'id': 'ryuryu24',
'title': 'ryuryu24 - review',
},
'playlist_count': 218,
}, {
'url': 'https://bj.afreecatv.com/parang1995/vods/highlight',
'info_dict': {
'_type': 'playlist',
'id': 'parang1995',
'title': 'parang1995 - highlight',
},
'playlist_count': 997,
}, {
'url': 'https://bj.afreecatv.com/ryuryu24/vods',
'info_dict': {
'_type': 'playlist',
'id': 'ryuryu24',
'title': 'ryuryu24 - all',
},
'playlist_count': 221,
}, {
'url': 'https://bj.afreecatv.com/ryuryu24/vods/balloonclip',
'info_dict': {
'_type': 'playlist',
'id': 'ryuryu24',
'title': 'ryuryu24 - balloonclip',
},
'playlist_count': 0,
}]
_PER_PAGE = 60
def _fetch_page(self, user_id, user_type, page):
page += 1
info = self._download_json(f'https://bjapi.afreecatv.com/api/{user_id}/vods/{user_type}', user_id,
query={'page': page, 'per_page': self._PER_PAGE, 'orderby': 'reg_date'},
note=f'Downloading {user_type} video page {page}')
for item in info['data']:
yield self.url_result(
f'https://vod.afreecatv.com/player/{item["title_no"]}/', AfreecaTVIE, item['title_no'])
def _real_extract(self, url):
user_id, user_type = self._match_valid_url(url).group('id', 'slug_type')
user_type = user_type or 'all'
entries = OnDemandPagedList(functools.partial(self._fetch_page, user_id, user_type), self._PER_PAGE)
return self.playlist_result(entries, user_id, f'{user_id} - {user_type}')
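The user extractor above relies on yt-dlp's OnDemandPagedList, which invokes the page function lazily, only for pages the consumer actually slices into (e.g. via --playlist-items). A self-contained sketch of the same pattern over an in-memory list:

import functools
from yt_dlp.utils import OnDemandPagedList

PAGE_SIZE = 60
FAKE_VODS = [f'v_{i}' for i in range(218)]  # stand-in for the bjapi results

def fetch_page(vods, page):
    # 'page' is 0-based; the real extractor adds 1 because the API is 1-based
    start = page * PAGE_SIZE
    yield from vods[start:start + PAGE_SIZE]

entries = OnDemandPagedList(functools.partial(fetch_page, FAKE_VODS), PAGE_SIZE)
print(entries.getslice(0, 3))  # only page 0 is ever fetched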

View File

@@ -18,7 +18,7 @@ class AliExpressLiveIE(InfoExtractor):
'id': '2800002704436634',
'ext': 'mp4',
'title': 'CASIMA7.22',
'thumbnail': r're:http://.*\.jpg',
'thumbnail': r're:https?://.*\.jpg',
'uploader': 'CASIMA Official Store',
'timestamp': 1500717600,
'upload_date': '20170722',

View File

@@ -7,6 +7,7 @@ from ..utils import (
int_or_none,
qualities,
remove_end,
strip_or_none,
try_get,
unified_timestamp,
url_basename,
@@ -102,10 +103,7 @@ class AllocineIE(InfoExtractor):
video_id = display_id
media_data = self._download_json(
'http://www.allocine.fr/ws/AcVisiondataV5.ashx?media=%s' % video_id, display_id)
title = remove_end(
self._html_search_regex(
r'(?s)<title>(.+?)</title>', webpage, 'title').strip(),
' - AlloCiné')
title = remove_end(strip_or_none(self._html_extract_title(webpage)), ' - AlloCiné')
for key, value in media_data['video'].items():
if not key.endswith('Path'):
continue

View File

@@ -0,0 +1,87 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
clean_html,
dict_get,
get_element_by_class,
int_or_none,
unified_strdate,
url_or_none,
)
class Alsace20TVBaseIE(InfoExtractor):
def _extract_video(self, video_id, url=None):
info = self._download_json(
'https://www.alsace20.tv/visionneuse/visio_v9_js.php?key=%s&habillage=0&mode=html' % (video_id, ),
video_id) or {}
title = info.get('titre')
formats = []
for res, fmt_url in (info.get('files') or {}).items():
formats.extend(
self._extract_smil_formats(fmt_url, video_id, fatal=False)
if '/smil:_' in fmt_url
else self._extract_mpd_formats(fmt_url, video_id, mpd_id=res, fatal=False))
self._sort_formats(formats)
webpage = (url and self._download_webpage(url, video_id, fatal=False)) or ''
thumbnail = url_or_none(dict_get(info, ('image', 'preview', )) or self._og_search_thumbnail(webpage))
upload_date = self._search_regex(r'/(\d{6})_', thumbnail, 'upload_date', default=None)
upload_date = unified_strdate('20%s-%s-%s' % (upload_date[:2], upload_date[2:4], upload_date[4:])) if upload_date else None
return {
'id': video_id,
'title': title,
'formats': formats,
'description': clean_html(get_element_by_class('wysiwyg', webpage)),
'upload_date': upload_date,
'thumbnail': thumbnail,
'duration': int_or_none(self._og_search_property('video:duration', webpage) if webpage else None),
'view_count': int_or_none(info.get('nb_vues')),
}
class Alsace20TVIE(Alsace20TVBaseIE):
_VALID_URL = r'https?://(?:www\.)?alsace20\.tv/(?:[\w-]+/)+[\w-]+-(?P<id>[\w]+)'
_TESTS = [{
'url': 'https://www.alsace20.tv/VOD/Actu/JT/Votre-JT-jeudi-3-fevrier-lyNHCXpYJh.html',
'info_dict': {
'id': 'lyNHCXpYJh',
'ext': 'mp4',
'description': 'md5:fc0bc4a0692d3d2dba4524053de4c7b7',
'title': 'Votre JT du jeudi 3 février',
'upload_date': '20220203',
'thumbnail': r're:https?://.+\.jpg',
'duration': 1073,
'view_count': int,
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
return self._extract_video(video_id, url)
class Alsace20TVEmbedIE(Alsace20TVBaseIE):
_VALID_URL = r'https?://(?:www\.)?alsace20\.tv/emb/(?P<id>[\w]+)'
_TESTS = [{
'url': 'https://www.alsace20.tv/emb/lyNHCXpYJh',
# 'md5': 'd91851bf9af73c0ad9b2cdf76c127fbb',
'info_dict': {
'id': 'lyNHCXpYJh',
'ext': 'mp4',
'title': 'Votre JT du jeudi 3 février',
'upload_date': '20220203',
'thumbnail': r're:https?://.+\.jpg',
'view_count': int,
},
'params': {
'format': 'bestvideo',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
return self._extract_video(video_id)

View File

@@ -74,14 +74,7 @@ class AluraIE(InfoExtractor):
"formats": formats
}
def _real_initialize(self):
self._login()
def _login(self):
username, password = self._get_login_info()
if username is None:
return
pass
def _perform_login(self, username, password):
login_page = self._download_webpage(
self._LOGIN_URL, None, 'Downloading login popup')

View File

@@ -15,25 +15,21 @@ from ..compat import compat_HTTPError
class AnimeLabBaseIE(InfoExtractor):
_LOGIN_REQUIRED = True
_LOGIN_URL = 'https://www.animelab.com/login'
_NETRC_MACHINE = 'animelab'
_LOGGED_IN = False
def _login(self):
def is_logged_in(login_webpage):
return 'Sign In' not in login_webpage
def _is_logged_in(self, login_page=None):
if not self._LOGGED_IN:
if not login_page:
login_page = self._download_webpage(self._LOGIN_URL, None, 'Downloading login page')
AnimeLabBaseIE._LOGGED_IN = 'Sign In' not in login_page
return self._LOGGED_IN
login_page = self._download_webpage(
self._LOGIN_URL, None, 'Downloading login page')
# Check if already logged in
if is_logged_in(login_page):
def _perform_login(self, username, password):
if self._is_logged_in():
return
(username, password) = self._get_login_info()
if username is None and self._LOGIN_REQUIRED:
self.raise_login_required('Login is required to access any AnimeLab content')
login_form = {
'email': username,
'password': password,
@@ -47,17 +43,14 @@ class AnimeLabBaseIE(InfoExtractor):
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 400:
raise ExtractorError('Unable to log in (wrong credentials?)', expected=True)
else:
raise
# if login was successful
if is_logged_in(response):
return
if not self._is_logged_in(response):
raise ExtractorError('Unable to login (cannot verify if logged in)')
def _real_initialize(self):
self._login()
if not self._is_logged_in():
self.raise_login_required('Login is required to access any AnimeLab content')
class AnimeLabIE(AnimeLabBaseIE):

View File

@@ -53,11 +53,7 @@ class AnimeOnDemandIE(InfoExtractor):
'only_matching': True,
}]
def _login(self):
username, password = self._get_login_info()
if username is None:
return
def _perform_login(self, username, password):
login_page = self._download_webpage(
self._LOGIN_URL, None, 'Downloading login page')
@@ -93,9 +89,6 @@ class AnimeOnDemandIE(InfoExtractor):
raise ExtractorError('Unable to login: %s' % error, expected=True)
raise ExtractorError('Unable to log in')
def _real_initialize(self):
self._login()
def _real_extract(self, url):
anime_id = self._match_id(url)

View File

@@ -0,0 +1,143 @@
# coding: utf-8
from __future__ import unicode_literals
import re
import urllib.parse
from .common import InfoExtractor
from ..utils import (
HEADRequest,
ExtractorError,
determine_ext,
scale_thumbnails_to_max_format_width,
unescapeHTML,
)
class Ant1NewsGrBaseIE(InfoExtractor):
def _download_and_extract_api_data(self, video_id, netloc, cid=None):
url = f'{self.http_scheme()}//{netloc}{self._API_PATH}'
info = self._download_json(url, video_id, query={'cid': cid or video_id})
try:
source = info['url']
except KeyError:
raise ExtractorError('no source found for %s' % video_id)
formats, subs = (self._extract_m3u8_formats_and_subtitles(source, video_id, 'mp4')
if determine_ext(source) == 'm3u8' else ([{'url': source}], {}))
self._sort_formats(formats)
thumbnails = scale_thumbnails_to_max_format_width(
formats, [{'url': info['thumb']}], r'(?<=/imgHandler/)\d+')
return {
'id': video_id,
'title': info.get('title'),
'thumbnails': thumbnails,
'formats': formats,
'subtitles': subs,
}
class Ant1NewsGrWatchIE(Ant1NewsGrBaseIE):
IE_NAME = 'ant1newsgr:watch'
IE_DESC = 'ant1news.gr videos'
_VALID_URL = r'https?://(?P<netloc>(?:www\.)?ant1news\.gr)/watch/(?P<id>\d+)/'
_API_PATH = '/templates/data/player'
_TESTS = [{
'url': 'https://www.ant1news.gr/watch/1506168/ant1-news-09112021-stis-18-45',
'md5': '95925e6b32106754235f2417e0d2dfab',
'info_dict': {
'id': '1506168',
'ext': 'mp4',
'title': 'md5:0ad00fa66ecf8aa233d26ab0dba7514a',
'description': 'md5:18665af715a6dcfeac1d6153a44f16b0',
'thumbnail': 'https://ant1media.azureedge.net/imgHandler/640/26d46bf6-8158-4f02-b197-7096c714b2de.jpg',
},
}]
def _real_extract(self, url):
video_id, netloc = self._match_valid_url(url).group('id', 'netloc')
webpage = self._download_webpage(url, video_id)
info = self._download_and_extract_api_data(video_id, netloc)
info['description'] = self._og_search_description(webpage)
return info
class Ant1NewsGrArticleIE(Ant1NewsGrBaseIE):
IE_NAME = 'ant1newsgr:article'
IE_DESC = 'ant1news.gr articles'
_VALID_URL = r'https?://(?:www\.)?ant1news\.gr/[^/]+/article/(?P<id>\d+)/'
_TESTS = [{
'url': 'https://www.ant1news.gr/afieromata/article/549468/o-tzeims-mpont-sta-meteora-oi-apeiles-kai-o-xesikomos-ton-kalogeron',
'md5': '294f18331bb516539d72d85a82887dcc',
'info_dict': {
'id': '_xvg/m_cmbatw=',
'ext': 'mp4',
'title': 'md5:a93e8ecf2e4073bfdffcb38f59945411',
'timestamp': 1603092840,
'upload_date': '20201019',
'thumbnail': 'https://ant1media.azureedge.net/imgHandler/640/756206d2-d640-40e2-b201-3555abdfc0db.jpg',
},
}, {
'url': 'https://ant1news.gr/Society/article/620286/symmoria-anilikon-dikigoros-thymaton-ithelan-na-toys-apoteleiosoyn',
'info_dict': {
'id': '620286',
'title': 'md5:91fe569e952e4d146485740ae927662b',
},
'playlist_mincount': 2,
'params': {
'skip_download': True,
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
info = self._search_json_ld(webpage, video_id, expected_type='NewsArticle')
embed_urls = list(Ant1NewsGrEmbedIE._extract_urls(webpage))
if not embed_urls:
raise ExtractorError('no videos found for %s' % video_id, expected=True)
return self.playlist_from_matches(
embed_urls, video_id, info.get('title'), ie=Ant1NewsGrEmbedIE.ie_key(),
video_kwargs={'url_transparent': True, 'timestamp': info.get('timestamp')})
class Ant1NewsGrEmbedIE(Ant1NewsGrBaseIE):
IE_NAME = 'ant1newsgr:embed'
IE_DESC = 'ant1news.gr embedded videos'
_BASE_PLAYER_URL_RE = r'(?:https?:)?//(?:[a-zA-Z0-9\-]+\.)?(?:antenna|ant1news)\.gr/templates/pages/player'
_VALID_URL = rf'{_BASE_PLAYER_URL_RE}\?([^#]+&)?cid=(?P<id>[^#&]+)'
_API_PATH = '/news/templates/data/jsonPlayer'
_TESTS = [{
'url': 'https://www.antenna.gr/templates/pages/player?cid=3f_li_c_az_jw_y_u=&w=670&h=377',
'md5': 'dfc58c3a11a5a9aad2ba316ed447def3',
'info_dict': {
'id': '3f_li_c_az_jw_y_u=',
'ext': 'mp4',
'title': 'md5:a30c93332455f53e1e84ae0724f0adf7',
'thumbnail': 'https://ant1media.azureedge.net/imgHandler/640/bbe31201-3f09-4a4e-87f5-8ad2159fffe2.jpg',
},
}]
@classmethod
def _extract_urls(cls, webpage):
_EMBED_URL_RE = rf'{cls._BASE_PLAYER_URL_RE}\?(?:(?!(?P=_q1)).)+'
_EMBED_RE = rf'<iframe[^>]+?src=(?P<_q1>["\'])(?P<url>{_EMBED_URL_RE})(?P=_q1)'
for mobj in re.finditer(_EMBED_RE, webpage):
url = unescapeHTML(mobj.group('url'))
if not cls.suitable(url):
continue
yield url
def _real_extract(self, url):
video_id = self._match_id(url)
canonical_url = self._request_webpage(
HEADRequest(url), video_id,
note='Resolving canonical player URL',
errnote='Could not resolve canonical player URL').geturl()
_, netloc, _, _, query, _ = urllib.parse.urlparse(canonical_url)
cid = urllib.parse.parse_qs(query)['cid'][0]
return self._download_and_extract_api_data(video_id, netloc, cid=cid)
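The second half of _real_extract above is plain standard-library URL surgery: resolve the canonical player URL via a HEAD request, then read cid back out of the query string. The parsing half, runnable against the embed test URL:

import urllib.parse

canonical = 'https://www.antenna.gr/templates/pages/player?cid=3f_li_c_az_jw_y_u=&w=670&h=377'
query = urllib.parse.urlparse(canonical).query
cid = urllib.parse.parse_qs(query)['cid'][0]
print(cid)  # 3f_li_c_az_jw_y_u=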

View File

@@ -3,7 +3,9 @@ from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
clean_html,
clean_podcast_url,
get_element_by_class,
int_or_none,
parse_iso8601,
try_get,
@@ -14,16 +16,17 @@ class ApplePodcastsIE(InfoExtractor):
_VALID_URL = r'https?://podcasts\.apple\.com/(?:[^/]+/)?podcast(?:/[^/]+){1,2}.*?\bi=(?P<id>\d+)'
_TESTS = [{
'url': 'https://podcasts.apple.com/us/podcast/207-whitney-webb-returns/id1135137367?i=1000482637777',
'md5': 'df02e6acb11c10e844946a39e7222b08',
'md5': '41dc31cd650143e530d9423b6b5a344f',
'info_dict': {
'id': '1000482637777',
'ext': 'mp3',
'title': '207 - Whitney Webb Returns',
'description': 'md5:13a73bade02d2e43737751e3987e1399',
'description': 'md5:75ef4316031df7b41ced4e7b987f79c6',
'upload_date': '20200705',
'timestamp': 1593921600,
'duration': 6425,
'timestamp': 1593932400,
'duration': 6454,
'series': 'The Tim Dillon Show',
'thumbnail': 're:.+[.](png|jpe?g|webp)',
}
}, {
'url': 'https://podcasts.apple.com/podcast/207-whitney-webb-returns/id1135137367?i=1000482637777',
@@ -39,24 +42,47 @@ class ApplePodcastsIE(InfoExtractor):
def _real_extract(self, url):
episode_id = self._match_id(url)
webpage = self._download_webpage(url, episode_id)
episode_data = {}
ember_data = {}
# new page type 2021-11
amp_data = self._parse_json(self._search_regex(
r'(?s)id="shoebox-media-api-cache-amp-podcasts"[^>]*>\s*({.+?})\s*<',
webpage, 'AMP data', default='{}'), episode_id, fatal=False) or {}
amp_data = try_get(amp_data,
lambda a: self._parse_json(
next(a[x] for x in iter(a) if episode_id in x),
episode_id),
dict) or {}
amp_data = amp_data.get('d') or []
episode_data = try_get(
amp_data,
lambda a: next(x for x in a
if x['type'] == 'podcast-episodes' and x['id'] == episode_id),
dict)
if not episode_data:
# try pre 2021-11 page type: TODO: consider deleting if no longer used
ember_data = self._parse_json(self._search_regex(
r'id="shoebox-ember-data-store"[^>]*>\s*({.+?})\s*<',
webpage, 'ember data'), episode_id)
r'(?s)id="shoebox-ember-data-store"[^>]*>\s*({.+?})\s*<',
webpage, 'ember data'), episode_id) or {}
ember_data = ember_data.get(episode_id) or ember_data
episode = ember_data['data']['attributes']
episode_data = try_get(ember_data, lambda x: x['data'], dict)
episode = episode_data['attributes']
description = episode.get('description') or {}
series = None
for inc in (ember_data.get('included') or []):
for inc in (amp_data or ember_data.get('included') or []):
if inc.get('type') == 'media/podcast':
series = try_get(inc, lambda x: x['attributes']['name'])
series = series or clean_html(get_element_by_class('podcast-header__identity', webpage))
return {
'id': episode_id,
'title': episode['name'],
'title': episode.get('name'),
'url': clean_podcast_url(episode['assetUrl']),
'description': description.get('standard') or description.get('short'),
'timestamp': parse_iso8601(episode.get('releaseDateTime')),
'duration': int_or_none(episode.get('durationInMilliseconds'), 1000),
'series': series,
'thumbnail': self._og_search_thumbnail(webpage),
'vcodec': 'none',
}
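The AMP branch above deals with Apple's shoebox cache: a dict whose values are themselves JSON strings, so the episode payload has to be decoded twice. A standalone sketch with a hypothetical page fragment:

import json
import re

html = '''<script id="shoebox-media-api-cache-amp-podcasts" type="fastboot/shoebox">
{"podcast-episodes-1000482637777": "{\\"d\\": [{\\"type\\": \\"podcast-episodes\\", \\"id\\": \\"1000482637777\\"}]}"}
</script>'''

cache = json.loads(re.search(
    r'(?s)id="shoebox-media-api-cache-amp-podcasts"[^>]*>\s*({.+?})\s*<', html).group(1))
# the matching cache value is itself JSON-encoded: decode a second time
inner = json.loads(next(v for k, v in cache.items() if '1000482637777' in k))
episode = next(x for x in inner['d'] if x['type'] == 'podcast-episodes')
print(episode['id'])  # 1000482637777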

View File

@@ -457,7 +457,7 @@ class YoutubeWebArchiveIE(InfoExtractor):
_OLDEST_CAPTURE_DATE = 20050214000000
_NEWEST_CAPTURE_DATE = 20500101000000
def _call_cdx_api(self, item_id, url, filters: list = None, collapse: list = None, query: dict = None, note='Downloading CDX API JSON'):
def _call_cdx_api(self, item_id, url, filters: list = None, collapse: list = None, query: dict = None, note=None, fatal=False):
# CDX docs: https://github.com/internetarchive/wayback/blob/master/wayback-cdx-server/README.md
query = {
'url': url,
@@ -468,7 +468,9 @@ class YoutubeWebArchiveIE(InfoExtractor):
'collapse': collapse or [],
**(query or {})
}
res = self._download_json('https://web.archive.org/cdx/search/cdx', item_id, note, query=query)
res = self._download_json(
'https://web.archive.org/cdx/search/cdx', item_id,
note or 'Downloading CDX API JSON', query=query, fatal=fatal)
if isinstance(res, list) and len(res) >= 2:
# format response to make it easier to use
return list(dict(zip(res[0], v)) for v in res[1:])
@@ -481,8 +483,7 @@ class YoutubeWebArchiveIE(InfoExtractor):
regex), webpage, name, default='{}'), video_id, fatal=False)
def _extract_webpage_title(self, webpage):
page_title = self._html_search_regex(
r'<title>([^<]*)</title>', webpage, 'title', default='')
page_title = self._html_extract_title(webpage, default='')
# YouTube video pages appear to always have either 'YouTube -' as prefix or '- YouTube' as suffix.
return self._html_search_regex(
r'(?:YouTube\s*-\s*(.*)$)|(?:(.*)\s*-\s*YouTube$)',

View File

@@ -124,8 +124,7 @@ class ArcPublishingIE(InfoExtractor):
formats.extend(smil_formats)
elif stream_type in ('ts', 'hls'):
m3u8_formats = self._extract_m3u8_formats(
s_url, uuid, 'mp4', 'm3u8' if is_live else 'm3u8_native',
m3u8_id='hls', fatal=False)
s_url, uuid, 'mp4', live=is_live, m3u8_id='hls', fatal=False)
if all([f.get('acodec') == 'none' for f in m3u8_formats]):
continue
for f in m3u8_formats:

View File

@@ -407,8 +407,9 @@ class ARDBetaMediathekIE(ARDMediathekBaseIE):
(?:(?:beta|www)\.)?ardmediathek\.de/
(?:(?P<client>[^/]+)/)?
(?:player|live|video|(?P<playlist>sendung|sammlung))/
(?:(?P<display_id>[^?#]+)/)?
(?P<id>(?(playlist)|Y3JpZDovL)[a-zA-Z0-9]+)'''
(?:(?P<display_id>(?(playlist)[^?#]+?|[^?#]+))/)?
(?P<id>(?(playlist)|Y3JpZDovL)[a-zA-Z0-9]+)
(?(playlist)/(?P<season>\d+)?/?(?:[?#]|$))'''
_TESTS = [{
'url': 'https://www.ardmediathek.de/mdr/video/die-robuste-roswita/Y3JpZDovL21kci5kZS9iZWl0cmFnL2Ntcy84MWMxN2MzZC0wMjkxLTRmMzUtODk4ZS0wYzhlOWQxODE2NGI/',
@@ -436,6 +437,13 @@ class ARDBetaMediathekIE(ARDMediathekBaseIE):
'description': 'md5:39578c7b96c9fe50afdf5674ad985e6b',
'upload_date': '20211108',
},
}, {
'url': 'https://www.ardmediathek.de/sendung/beforeigners/beforeigners/staffel-1/Y3JpZDovL2Rhc2Vyc3RlLmRlL2JlZm9yZWlnbmVycw/1',
'playlist_count': 6,
'info_dict': {
'id': 'Y3JpZDovL2Rhc2Vyc3RlLmRlL2JlZm9yZWlnbmVycw',
'title': 'beforeigners/beforeigners/staffel-1',
},
}, {
'url': 'https://beta.ardmediathek.de/ard/video/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE',
'only_matching': True,
@@ -561,14 +569,15 @@ class ARDBetaMediathekIE(ARDMediathekBaseIE):
break
pageNumber = pageNumber + 1
return self.playlist_result(entries, playlist_title=display_id)
return self.playlist_result(entries, playlist_id, playlist_title=display_id)
def _real_extract(self, url):
video_id, display_id, playlist_type, client = self._match_valid_url(url).group(
'id', 'display_id', 'playlist', 'client')
video_id, display_id, playlist_type, client, season_number = self._match_valid_url(url).group(
'id', 'display_id', 'playlist', 'client', 'season')
display_id, client = display_id or video_id, client or 'ard'
if playlist_type:
# TODO: Extract only specified season
return self._ARD_extract_playlist(url, video_id, display_id, client, playlist_type)
player_page = self._download_json(
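The reworked _VALID_URL above hinges on regex conditionals, (?(name)yes|no), which select a branch depending on whether the named group participated in the match. A trimmed, runnable demo against the new season test URL:

import re

pattern = re.compile(r'''(?x)
    (?:player|live|video|(?P<playlist>sendung|sammlung))/
    (?:(?P<display_id>(?(playlist)[^?#]+?|[^?#]+))/)?
    (?P<id>(?(playlist)|Y3JpZDovL)[a-zA-Z0-9]+)
    (?(playlist)/(?P<season>\d+)?/?(?:[?#]|$))''')

m = pattern.search(
    'https://www.ardmediathek.de/sendung/beforeigners/beforeigners/staffel-1/'
    'Y3JpZDovL2Rhc2Vyc3RlLmRlL2JlZm9yZWlnbmVycw/1')
print(m.group('playlist', 'display_id', 'season'))
# ('sendung', 'beforeigners/beforeigners/staffel-1', '1')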

View File

@@ -12,6 +12,7 @@ from ..utils import (
int_or_none,
parse_qs,
qualities,
strip_or_none,
try_get,
unified_strdate,
url_or_none,
@@ -137,6 +138,7 @@ class ArteTVIE(ArteTVBaseIE):
break
else:
lang_pref = -1
format_note = '%s, %s' % (f.get('versionCode'), f.get('versionLibelle'))
media_type = f.get('mediaType')
if media_type == 'hls':
@@ -144,14 +146,17 @@ class ArteTVIE(ArteTVBaseIE):
format_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id=format_id, fatal=False)
for m3u8_format in m3u8_formats:
m3u8_format['language_preference'] = lang_pref
m3u8_format.update({
'language_preference': lang_pref,
'format_note': format_note,
})
formats.extend(m3u8_formats)
continue
format = {
'format_id': format_id,
'language_preference': lang_pref,
'format_note': '%s, %s' % (f.get('versionCode'), f.get('versionLibelle')),
'format_note': format_note,
'width': int_or_none(f.get('width')),
'height': int_or_none(f.get('height')),
'tbr': int_or_none(f.get('bitrate')),
@@ -253,3 +258,44 @@ class ArteTVPlaylistIE(ArteTVBaseIE):
title = collection.get('title')
description = collection.get('shortDescription') or collection.get('teaserText')
return self.playlist_result(entries, playlist_id, title, description)
class ArteTVCategoryIE(ArteTVBaseIE):
_VALID_URL = r'https?://(?:www\.)?arte\.tv/(?P<lang>%s)/videos/(?P<id>[\w-]+(?:/[\w-]+)*)/?\s*$' % ArteTVBaseIE._ARTE_LANGUAGES
_TESTS = [{
'url': 'https://www.arte.tv/en/videos/politics-and-society/',
'info_dict': {
'id': 'politics-and-society',
'title': 'Politics and society',
'description': 'Investigative documentary series, geopolitical analysis, and international commentary',
},
'playlist_mincount': 13,
},
]
@classmethod
def suitable(cls, url):
return (
not any(ie.suitable(url) for ie in (ArteTVIE, ArteTVPlaylistIE, ))
and super(ArteTVCategoryIE, cls).suitable(url))
def _real_extract(self, url):
lang, playlist_id = self._match_valid_url(url).groups()
webpage = self._download_webpage(url, playlist_id)
items = []
for video in re.finditer(
r'<a\b[^>]*?href\s*=\s*(?P<q>"|\'|\b)(?P<url>https?://www\.arte\.tv/%s/videos/[\w/-]+)(?P=q)' % lang,
webpage):
video = video.group('url')
if video == url:
continue
if any(ie.suitable(video) for ie in (ArteTVIE, ArteTVPlaylistIE, )):
items.append(video)
title = (self._og_search_title(webpage, default=None)
or self._html_search_regex(r'<title\b[^>]*>([^<]+)</title>', webpage, 'title', default=None))
title = strip_or_none(title.rsplit('|', 1)[0] if title else None) or self._generic_title(url)
return self.playlist_from_matches(items, playlist_id=playlist_id, playlist_title=title,
description=self._og_search_description(webpage, default=None))

View File

@@ -181,8 +181,7 @@ class AsianCrushPlaylistIE(AsianCrushBaseIE):
'title', default=None) or self._og_search_title(
webpage, default=None) or self._html_search_meta(
'twitter:title', webpage, 'title',
default=None) or self._search_regex(
r'<title>([^<]+)</title>', webpage, 'title', fatal=False)
default=None) or self._html_extract_title(webpage)
if title:
title = re.sub(r'\s*\|\s*.+?$', '', title)

View File

@@ -37,9 +37,6 @@ class AtresPlayerIE(InfoExtractor):
]
_API_BASE = 'https://api.atresplayer.com/'
def _real_initialize(self):
self._login()
def _handle_error(self, e, code):
if isinstance(e.cause, compat_HTTPError) and e.cause.code == code:
error = self._parse_json(e.cause.read(), None)
@@ -48,11 +45,7 @@ class AtresPlayerIE(InfoExtractor):
raise ExtractorError(error['error_description'], expected=True)
raise
def _login(self):
username, password = self._get_login_info()
if username is None:
return
def _perform_login(self, username, password):
self._request_webpage(
self._API_BASE + 'login', None, 'Downloading login page')

View File

@@ -8,6 +8,7 @@ from ..utils import (
float_or_none,
jwt_encode_hs256,
try_get,
ExtractorError,
)
@@ -94,6 +95,11 @@ class ATVAtIE(InfoExtractor):
})
video_id, videos_data = list(videos['data'].items())[0]
error_msg = try_get(videos_data, lambda x: x['error']['title'])
if error_msg == 'Geo check failed':
self.raise_geo_restricted(error_msg)
elif error_msg:
raise ExtractorError(error_msg)
entries = [
self._extract_video_info(url, contentResource[video['id']], video)
for video in videos_data]

View File

@@ -29,6 +29,7 @@ class AudiomackIE(InfoExtractor):
}
},
# audiomack wrapper around soundcloud song
# Needs new test URL.
{
'add_ie': ['Soundcloud'],
'url': 'http://www.audiomack.com/song/hip-hop-daily/black-mamba-freestyle',

View File

@@ -11,11 +11,12 @@ class AZMedienIE(InfoExtractor):
IE_DESC = 'AZ Medien videos'
_VALID_URL = r'''(?x)
https?://
(?:www\.)?
(?:www\.|tv\.)?
(?P<host>
telezueri\.ch|
telebaern\.tv|
telem1\.ch
telem1\.ch|
tvo-online\.ch
)/
[^/]+/
(?P<id>
@@ -30,7 +31,7 @@ class AZMedienIE(InfoExtractor):
'''
_TESTS = [{
'url': 'https://www.telezueri.ch/sonntalk/bundesrats-vakanzen-eu-rahmenabkommen-133214569',
'url': 'https://tv.telezueri.ch/sonntalk/bundesrats-vakanzen-eu-rahmenabkommen-133214569',
'info_dict': {
'id': '1_anruz3wy',
'ext': 'mp4',
@@ -38,6 +39,9 @@ class AZMedienIE(InfoExtractor):
'uploader_id': 'TVOnline',
'upload_date': '20180930',
'timestamp': 1538328802,
'view_count': int,
'thumbnail': 'http://cfvod.kaltura.com/p/1719221/sp/171922100/thumbnail/entry_id/1_anruz3wy/version/100031',
'duration': 1930
},
'params': {
'skip_download': True,

153 yt_dlp/extractor/banbye.py Normal file
View File

@@ -0,0 +1,153 @@
# coding: utf-8
from __future__ import unicode_literals
import math
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse_urlparse,
compat_parse_qs,
)
from ..utils import (
format_field,
InAdvancePagedList,
traverse_obj,
unified_timestamp,
)
class BanByeBaseIE(InfoExtractor):
_API_BASE = 'https://api.banbye.com'
_CDN_BASE = 'https://cdn.banbye.com'
_VIDEO_BASE = 'https://banbye.com/watch'
@staticmethod
def _extract_playlist_id(url, param='playlist'):
return compat_parse_qs(
compat_urllib_parse_urlparse(url).query).get(param, [None])[0]
def _extract_playlist(self, playlist_id):
data = self._download_json(f'{self._API_BASE}/playlists/{playlist_id}', playlist_id)
return self.playlist_result([
self.url_result(f'{self._VIDEO_BASE}/{video_id}', BanByeIE)
for video_id in data['videoIds']], playlist_id, data.get('name'))
class BanByeIE(BanByeBaseIE):
_VALID_URL = r'https?://(?:www\.)?banbye\.com/(?:en/)?watch/(?P<id>\w+)'
_TESTS = [{
'url': 'https://banbye.com/watch/v_ytfmvkVYLE8T',
'md5': '2f4ea15c5ca259a73d909b2cfd558eb5',
'info_dict': {
'id': 'v_ytfmvkVYLE8T',
'ext': 'mp4',
'title': 'md5:5ec098f88a0d796f987648de6322ba0f',
'description': 'md5:4d94836e73396bc18ef1fa0f43e5a63a',
'uploader': 'wRealu24',
'channel_id': 'ch_wrealu24',
'channel_url': 'https://banbye.com/channel/ch_wrealu24',
'timestamp': 1647604800,
'upload_date': '20220318',
'duration': 1931,
'thumbnail': r're:https?://.*\.webp',
'tags': 'count:5',
'like_count': int,
'dislike_count': int,
'view_count': int,
'comment_count': int,
},
}, {
'url': 'https://banbye.com/watch/v_2JjQtqjKUE_F?playlistId=p_Ld82N6gBw_OJ',
'info_dict': {
'title': 'Krzysztof Karoń',
'id': 'p_Ld82N6gBw_OJ',
},
'playlist_count': 9,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
playlist_id = self._extract_playlist_id(url, 'playlistId')
if self._yes_playlist(playlist_id, video_id):
return self._extract_playlist(playlist_id)
data = self._download_json(f'{self._API_BASE}/videos/{video_id}', video_id)
thumbnails = [{
'id': f'{quality}p',
'url': f'{self._CDN_BASE}/video/{video_id}/{quality}.webp',
} for quality in [48, 96, 144, 240, 512, 1080]]
formats = [{
'format_id': f'http-{quality}p',
'quality': quality,
'url': f'{self._CDN_BASE}/video/{video_id}/{quality}.mp4',
} for quality in data['quality']]
self._sort_formats(formats)
return {
'id': video_id,
'title': data.get('title'),
'description': data.get('desc'),
'uploader': traverse_obj(data, ('channel', 'name')),
'channel_id': data.get('channelId'),
'channel_url': format_field(data, 'channelId', 'https://banbye.com/channel/%s'),
'timestamp': unified_timestamp(data.get('publishedAt')),
'duration': data.get('duration'),
'tags': data.get('tags'),
'formats': formats,
'thumbnails': thumbnails,
'like_count': data.get('likes'),
'dislike_count': data.get('dislikes'),
'view_count': data.get('views'),
'comment_count': data.get('commentCount'),
}
class BanByeChannelIE(BanByeBaseIE):
_VALID_URL = r'https?://(?:www\.)?banbye\.com/(?:en/)?channel/(?P<id>\w+)'
_TESTS = [{
'url': 'https://banbye.com/channel/ch_wrealu24',
'info_dict': {
'title': 'wRealu24',
'id': 'ch_wrealu24',
'description': 'md5:da54e48416b74dfdde20a04867c0c2f6',
},
'playlist_mincount': 791,
}, {
'url': 'https://banbye.com/channel/ch_wrealu24?playlist=p_Ld82N6gBw_OJ',
'info_dict': {
'title': 'Krzysztof Karoń',
'id': 'p_Ld82N6gBw_OJ',
},
'playlist_count': 9,
}]
_PAGE_SIZE = 100
def _real_extract(self, url):
channel_id = self._match_id(url)
playlist_id = self._extract_playlist_id(url)
if playlist_id:
return self._extract_playlist(playlist_id)
def page_func(page_num):
data = self._download_json(f'{self._API_BASE}/videos', channel_id, query={
'channelId': channel_id,
'sort': 'new',
'limit': self._PAGE_SIZE,
'offset': page_num * self._PAGE_SIZE,
}, note=f'Downloading page {page_num + 1}')
return [
self.url_result(f"{self._VIDEO_BASE}/{video['_id']}", BanByeIE)
for video in data['items']
]
channel_data = self._download_json(f'{self._API_BASE}/channels/{channel_id}', channel_id)
entries = InAdvancePagedList(
page_func,
math.ceil(channel_data['videoCount'] / self._PAGE_SIZE),
self._PAGE_SIZE)
return self.playlist_result(
entries, channel_id, channel_data.get('name'), channel_data.get('description'))
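Unlike the AfreecaTV user extractor, the total videoCount is known up front here, so InAdvancePagedList can be handed an exact page count instead of probing pages on demand. A minimal sketch over an in-memory list:

import math
from yt_dlp.utils import InAdvancePagedList

PAGE_SIZE = 100
ITEMS = [f'v_{i}' for i in range(230)]  # stand-in for the channel's videos

def page_func(page_num):
    # 0-based page index, mirroring the extractor's offset arithmetic
    return ITEMS[page_num * PAGE_SIZE:(page_num + 1) * PAGE_SIZE]

entries = InAdvancePagedList(page_func, math.ceil(len(ITEMS) / PAGE_SIZE), PAGE_SIZE)
print(entries.getslice(95, 105))  # spans pages 0 and 1; page 2 is never built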

View File

@@ -183,6 +183,7 @@ class BandcampIE(InfoExtractor):
'format_note': f.get('description'),
'filesize': parse_filesize(f.get('size_mb')),
'vcodec': 'none',
'acodec': format_id.split('-')[0],
})
self._sort_formats(formats)
@@ -212,7 +213,7 @@ class BandcampIE(InfoExtractor):
class BandcampAlbumIE(BandcampIE):
IE_NAME = 'Bandcamp:album'
_VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?!/music)(?:/album/(?P<id>[^/?#&]+))?'
_VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com/album/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://blazo.bandcamp.com/album/jazz-format-mixtape-vol-1',
@@ -257,14 +258,6 @@ class BandcampAlbumIE(BandcampIE):
'id': 'hierophany-of-the-open-grave',
},
'playlist_mincount': 9,
}, {
'url': 'http://dotscale.bandcamp.com',
'info_dict': {
'title': 'Loom',
'id': 'dotscale',
'uploader_id': 'dotscale',
},
'playlist_mincount': 7,
}, {
# with escaped quote in title
'url': 'https://jstrecords.bandcamp.com/album/entropy-ep',
@@ -391,41 +384,63 @@ class BandcampWeeklyIE(BandcampIE):
}
class BandcampMusicIE(InfoExtractor):
_VALID_URL = r'https?://(?P<id>[^/]+)\.bandcamp\.com/music'
class BandcampUserIE(InfoExtractor):
IE_NAME = 'Bandcamp:user'
_VALID_URL = r'https?://(?!www\.)(?P<id>[^.]+)\.bandcamp\.com(?:/music)?/?(?:[#?]|$)'
_TESTS = [{
# Type 1 Bandcamp user page.
'url': 'https://adrianvonziegler.bandcamp.com',
'info_dict': {
'id': 'adrianvonziegler',
'title': 'Discography of adrianvonziegler',
},
'playlist_mincount': 23,
}, {
# Bandcamp user page with only one album
'url': 'http://dotscale.bandcamp.com',
'info_dict': {
'id': 'dotscale',
'title': 'Discography of dotscale'
},
'playlist_count': 1,
}, {
# Type 2 Bandcamp user page.
'url': 'https://nightcallofficial.bandcamp.com',
'info_dict': {
'id': 'nightcallofficial',
'title': 'Discography of nightcallofficial',
},
'playlist_count': 4,
}, {
'url': 'https://steviasphere.bandcamp.com/music',
'playlist_mincount': 47,
'info_dict': {
'id': 'steviasphere',
'title': 'Discography of steviasphere',
},
}, {
'url': 'https://coldworldofficial.bandcamp.com/music',
'playlist_mincount': 10,
'info_dict': {
'id': 'coldworldofficial',
'title': 'Discography of coldworldofficial',
},
}, {
'url': 'https://nuclearwarnowproductions.bandcamp.com/music',
'playlist_mincount': 399,
'info_dict': {
'id': 'nuclearwarnowproductions',
'title': 'Discography of nuclearwarnowproductions',
},
}
]
_TYPE_IE_DICT = {
'album': BandcampAlbumIE.ie_key(),
'track': BandcampIE.ie_key()
}
}]
def _real_extract(self, url):
id = self._match_id(url)
webpage = self._download_webpage(url, id)
items = re.findall(r'href\=\"\/(?P<path>(?P<type>album|track)+/[^\"]+)', webpage)
entries = [
self.url_result(
f'https://{id}.bandcamp.com/{item[0]}',
ie=self._TYPE_IE_DICT[item[1]])
for item in items]
return self.playlist_result(entries, id)
uploader = self._match_id(url)
webpage = self._download_webpage(url, uploader)
discography_data = (re.findall(r'<li data-item-id=["\'][^>]+>\s*<a href=["\']([^"\']+)', webpage)
or re.findall(r'<div[^>]+trackTitle["\'][^"\']+["\']([^"\']+)', webpage))
return self.playlist_from_matches(
discography_data, uploader, f'Discography of {uploader}', getter=lambda x: urljoin(url, x))

View File

@@ -11,6 +11,7 @@ from ..compat import (
compat_etree_Element,
compat_HTTPError,
compat_str,
compat_urllib_error,
compat_urlparse,
)
from ..utils import (
@@ -38,7 +39,7 @@ from ..utils import (
class BBCCoUkIE(InfoExtractor):
IE_NAME = 'bbc.co.uk'
IE_DESC = 'BBC iPlayer'
_ID_REGEX = r'(?:[pbm][\da-z]{7}|w[\da-z]{7,14})'
_ID_REGEX = r'(?:[pbml][\da-z]{7}|w[\da-z]{7,14})'
_VALID_URL = r'''(?x)
https?://
(?:www\.)?bbc\.co\.uk/
@@ -263,11 +264,7 @@ class BBCCoUkIE(InfoExtractor):
'only_matching': True,
}]
def _login(self):
username, password = self._get_login_info()
if username is None:
return
def _perform_login(self, username, password):
login_page = self._download_webpage(
self._LOGIN_URL, None, 'Downloading signin page')
@@ -293,9 +290,6 @@ class BBCCoUkIE(InfoExtractor):
'Unable to login: %s' % error, expected=True)
raise ExtractorError('Unable to log in')
def _real_initialize(self):
self._login()
class MediaSelectionError(Exception):
def __init__(self, id):
self.id = id
@@ -394,9 +388,17 @@ class BBCCoUkIE(InfoExtractor):
formats.extend(self._extract_mpd_formats(
href, programme_id, mpd_id=format_id, fatal=False))
elif transfer_format == 'hls':
formats.extend(self._extract_m3u8_formats(
# TODO: let expected_status be passed into _extract_xxx_formats() instead
try:
fmts = self._extract_m3u8_formats(
href, programme_id, ext='mp4', entry_protocol='m3u8_native',
m3u8_id=format_id, fatal=False))
m3u8_id=format_id, fatal=False)
except ExtractorError as e:
if not (isinstance(e.exc_info[1], compat_urllib_error.HTTPError)
and e.exc_info[1].code in (403, 404)):
raise
fmts = []
formats.extend(fmts)
elif transfer_format == 'hds':
formats.extend(self._extract_f4m_formats(
href, programme_id, f4m_id=format_id, fatal=False))
@@ -784,21 +786,33 @@ class BBCIE(BBCCoUkIE):
'timestamp': 1437785037,
'upload_date': '20150725',
},
}, {
# video with window.__INITIAL_DATA__ and value as JSON string
'url': 'https://www.bbc.com/news/av/world-europe-59468682',
'info_dict': {
'id': 'p0b71qth',
'ext': 'mp4',
'title': 'Why France is making this woman a national hero',
'description': 'md5:7affdfab80e9c3a1f976230a1ff4d5e4',
'thumbnail': r're:https?://.+/.+\.jpg',
'timestamp': 1638230731,
'upload_date': '20211130',
},
}, {
# single video article embedded with data-media-vpid
'url': 'http://www.bbc.co.uk/sport/rowing/35908187',
'only_matching': True,
}, {
# bbcthreeConfig
'url': 'https://www.bbc.co.uk/bbcthree/clip/73d0bbd0-abc3-4cea-b3c0-cdae21905eb1',
'info_dict': {
'id': 'p06556y7',
'ext': 'mp4',
'title': 'Transfers: Cristiano Ronaldo to Man Utd, Arsenal to spend?',
'description': 'md5:4b7dfd063d5a789a1512e99662be3ddd',
'title': 'Things Not To Say to people that live on council estates',
'description': "From being labelled a 'chav', to the presumption that they're 'scroungers', people who live on council estates encounter all kinds of prejudices and false assumptions about themselves, their families, and their lifestyles. Here, eight people discuss the common statements, misconceptions, and clichés that they're tired of hearing.",
'duration': 360,
'thumbnail': r're:https?://.+/.+\.jpg',
},
'params': {
'skip_download': True,
}
}, {
# window.__PRELOADED_STATE__
'url': 'https://www.bbc.co.uk/radio/play/b0b9z4yl',
@@ -892,9 +906,8 @@ class BBCIE(BBCCoUkIE):
playlist_title = json_ld_info.get('title')
if not playlist_title:
playlist_title = self._og_search_title(
webpage, default=None) or self._html_search_regex(
r'<title>(.+?)</title>', webpage, 'playlist title', default=None)
playlist_title = (self._og_search_title(webpage, default=None)
or self._html_extract_title(webpage, 'playlist title', default=None))
if playlist_title:
playlist_title = re.sub(r'(.+)\s*-\s*BBC.*?$', r'\1', playlist_title).strip()
@@ -1171,9 +1184,16 @@ class BBCIE(BBCCoUkIE):
return self.playlist_result(
entries, playlist_id, playlist_title, playlist_description)
initial_data = self._parse_json(self._search_regex(
r'window\.__INITIAL_DATA__\s*=\s*({.+?});', webpage,
'preload state', default='{}'), playlist_id, fatal=False)
initial_data = self._search_regex(
r'window\.__INITIAL_DATA__\s*=\s*("{.+?}")\s*;', webpage,
'quoted preload state', default=None)
if initial_data is None:
initial_data = self._search_regex(
r'window\.__INITIAL_DATA__\s*=\s*({.+?})\s*;', webpage,
'preload state', default={})
else:
initial_data = self._parse_json(initial_data or '"{}"', playlist_id, fatal=False)
initial_data = self._parse_json(initial_data, playlist_id, fatal=False)
if initial_data:
def parse_media(media):
if not media:
@@ -1214,7 +1234,10 @@ class BBCIE(BBCCoUkIE):
if name == 'media-experience':
parse_media(try_get(resp, lambda x: x['data']['initialItem']['mediaItem'], dict))
elif name == 'article':
for block in (try_get(resp, lambda x: x['data']['blocks'], list) or []):
for block in (try_get(resp,
(lambda x: x['data']['blocks'],
lambda x: x['data']['content']['model']['blocks'],),
list) or []):
if block.get('type') != 'media':
continue
parse_media(block.get('model'))

View File

@@ -1,32 +1,45 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import (
compat_str,
)
from ..utils import (
int_or_none,
parse_qs,
traverse_obj,
try_get,
unified_timestamp,
)
class BeegIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?beeg\.(?:com|porn(?:/video)?)/(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?beeg\.(?:com(?:/video)?)/-?(?P<id>\d+)'
_TESTS = [{
# api/v6 v1
'url': 'http://beeg.com/5416503',
'md5': 'a1a1b1a8bc70a89e49ccfd113aed0820',
'url': 'https://beeg.com/-0983946056129650',
'md5': '51d235147c4627cfce884f844293ff88',
'info_dict': {
'id': '5416503',
'id': '0983946056129650',
'ext': 'mp4',
'title': 'Sultry Striptease',
'description': 'md5:d22219c09da287c14bed3d6c37ce4bc2',
'timestamp': 1391813355,
'upload_date': '20140207',
'duration': 383,
'title': 'sucked cock and fucked in a private plane',
'duration': 927,
'tags': list,
'age_limit': 18,
'upload_date': '20220131',
'timestamp': 1643656455,
'display_id': 2540839,
}
}, {
'url': 'https://beeg.com/-0599050563103750?t=4-861',
'md5': 'bd8b5ea75134f7f07fad63008db2060e',
'info_dict': {
'id': '0599050563103750',
'ext': 'mp4',
'title': 'Bad Relatives',
'duration': 2060,
'tags': list,
'age_limit': 18,
'description': 'md5:b4fc879a58ae6c604f8f259155b7e3b9',
'timestamp': 1643623200,
'display_id': 2569965,
'upload_date': '20220131',
}
}, {
# api/v6 v2
@@ -36,12 +49,6 @@ class BeegIE(InfoExtractor):
# api/v6 v2 w/o t
'url': 'https://beeg.com/1277207756',
'only_matching': True,
}, {
'url': 'https://beeg.porn/video/5416503',
'only_matching': True,
}, {
'url': 'https://beeg.porn/5416503',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -49,68 +56,38 @@ class BeegIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
beeg_version = self._search_regex(
r'beeg_version\s*=\s*([\da-zA-Z_-]+)', webpage, 'beeg version',
default='1546225636701')
if len(video_id) >= 10:
query = {
'v': 2,
}
qs = parse_qs(url)
t = qs.get('t', [''])[0].split('-')
if len(t) > 1:
query.update({
's': t[0],
'e': t[1],
})
else:
query = {'v': 1}
for api_path in ('', 'api.'):
video = self._download_json(
'https://%sbeeg.com/api/v6/%s/video/%s'
% (api_path, beeg_version, video_id), video_id,
fatal=api_path == 'api.', query=query)
if video:
break
'https://store.externulls.com/facts/file/%s' % video_id,
video_id, 'Downloading JSON for %s' % video_id)
fc_facts = video.get('fc_facts')
first_fact = {}
for fact in fc_facts:
if not first_fact or try_get(fact, lambda x: x['id'] < first_fact['id']):
first_fact = fact
resources = traverse_obj(video, ('file', 'hls_resources')) or first_fact.get('hls_resources')
formats = []
for format_id, video_url in video.items():
if not video_url:
for format_id, video_uri in resources.items():
if not video_uri:
continue
height = self._search_regex(
r'^(\d+)[pP]$', format_id, 'height', default=None)
if not height:
continue
formats.append({
'url': self._proto_relative_url(
video_url.replace('{DATA_MARKERS}', 'data=pc_XX__%s_0' % beeg_version), 'https:'),
'format_id': format_id,
'height': int(height),
})
height = int_or_none(self._search_regex(r'fl_cdn_(\d+)', format_id, 'height', default=None))
current_formats = self._extract_m3u8_formats(f'https://video.beeg.com/{video_uri}', video_id, ext='mp4', m3u8_id=str(height))
for f in current_formats:
f['height'] = height
formats.extend(current_formats)
self._sort_formats(formats)
title = video['title']
video_id = compat_str(video.get('id') or video_id)
display_id = video.get('code')
description = video.get('desc')
series = video.get('ps_name')
timestamp = unified_timestamp(video.get('date'))
duration = int_or_none(video.get('duration'))
tags = [tag.strip() for tag in video['tags'].split(',')] if video.get('tags') else None
return {
'id': video_id,
'display_id': display_id,
'title': title,
'description': description,
'series': series,
'timestamp': timestamp,
'duration': duration,
'tags': tags,
'display_id': first_fact.get('id'),
'title': traverse_obj(video, ('file', 'stuff', 'sf_name')),
'description': traverse_obj(video, ('file', 'stuff', 'sf_story')),
'timestamp': unified_timestamp(first_fact.get('fc_created')),
'duration': int_or_none(traverse_obj(video, ('file', 'fl_duration'))),
'tags': traverse_obj(video, ('tags', ..., 'tg_name')),
'formats': formats,
'age_limit': self._rta_search(webpage),
}
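The rewritten extractor leans on yt-dlp's traverse_obj, which walks nested dicts and lists; `...` branches over every list element and silently skips items missing the key. A small illustration with a trimmed, made-up API response:

from yt_dlp.utils import traverse_obj

video = {
    'file': {'stuff': {'sf_name': 'Bad Relatives'}, 'fl_duration': 2060},
    'tags': [{'tg_name': 'a'}, {'tg_name': 'b'}, {'other': 1}],
}
print(traverse_obj(video, ('file', 'stuff', 'sf_name')))  # Bad Relatives
print(traverse_obj(video, ('tags', ..., 'tg_name')))      # ['a', 'b']
print(traverse_obj(video, ('file', 'missing')))           # None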

59 yt_dlp/extractor/bigo.py Normal file
View File

@@ -0,0 +1,59 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import ExtractorError, urlencode_postdata
class BigoIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?bigo\.tv/(?:[a-z]{2,}/)?(?P<id>[^/]+)'
_TESTS = [{
'url': 'https://www.bigo.tv/ja/221338632',
'info_dict': {
'id': '6576287577575737440',
'title': '土よ〜💁‍♂️ 休憩室/REST room',
'thumbnail': r're:https?://.+',
'uploader': '✨Shin💫',
'uploader_id': '221338632',
'is_live': True,
},
'skip': 'livestream',
}, {
'url': 'https://www.bigo.tv/th/Tarlerm1304',
'only_matching': True,
}, {
'url': 'https://bigo.tv/115976881',
'only_matching': True,
}]
def _real_extract(self, url):
user_id = self._match_id(url)
info_raw = self._download_json(
'https://bigo.tv/studio/getInternalStudioInfo',
user_id, data=urlencode_postdata({'siteId': user_id}))
if not isinstance(info_raw, dict):
raise ExtractorError('Received invalid JSON data')
if info_raw.get('code'):
raise ExtractorError(
'Bigo says: %s (code %s)' % (info_raw.get('msg'), info_raw.get('code')), expected=True)
info = info_raw.get('data') or {}
if not info.get('alive'):
raise ExtractorError('This user is offline.', expected=True)
return {
'id': info.get('roomId') or user_id,
'title': info.get('roomTopic') or info.get('nick_name') or user_id,
'formats': [{
'url': info.get('hls_src'),
'ext': 'mp4',
'protocol': 'm3u8',
}],
'thumbnail': info.get('snapshot'),
'uploader': info.get('nick_name'),
'uploader_id': user_id,
'is_live': True,
}

View File

@@ -15,6 +15,7 @@ from ..compat import (
)
from ..utils import (
ExtractorError,
filter_dict,
int_or_none,
float_or_none,
mimetype2ext,
@@ -50,7 +51,7 @@ class BiliBiliIE(InfoExtractor):
_TESTS = [{
'url': 'http://www.bilibili.com/video/av1074402/',
'md5': '5f7d29e1a2872f3df0cf76b1f87d3788',
'md5': '7ac275ec84a99a6552c5d229659a0fe1',
'info_dict': {
'id': '1074402_part1',
'ext': 'mp4',
@@ -60,6 +61,11 @@ class BiliBiliIE(InfoExtractor):
'upload_date': '20140420',
'description': 'md5:ce18c2a2d2193f0df2917d270f2e5923',
'timestamp': 1398012678,
'tags': ['顶上去报复社会', '该来的总会来的', '金克拉是检验歌曲的唯一标准', '坷垃教主', '金坷垃', '邓紫棋', '治愈系坷垃'],
'bv_id': 'BV11x411K7CN',
'cid': '1554319',
'thumbnail': 'http://i2.hdslb.com/bfs/archive/c79a8cf0347cd7a897c53a2f756e96aead128e8c.jpg',
'duration': 308.36,
},
}, {
# Tested in BiliBiliBangumiIE
@@ -90,6 +96,11 @@ class BiliBiliIE(InfoExtractor):
'timestamp': 1488382634,
'uploader_id': '65880958',
'uploader': '阿滴英文',
'thumbnail': 'http://i2.hdslb.com/bfs/archive/49267ce20bc246be6304bf369a3ded0256854c23.jpg',
'cid': '14694589',
'duration': 554.117,
'bv_id': 'BV13x41117TL',
'tags': ['人文', '英语', '文化', '公开课', '阿滴英文'],
},
'params': {
'skip_download': True,
@@ -106,6 +117,27 @@ class BiliBiliIE(InfoExtractor):
'title': '物语中的人物是如何吐槽自己的OP的'
},
'playlist_count': 17,
}, {
# Correct matching of single and double quotes in title
'url': 'https://www.bilibili.com/video/BV1NY411E7Rx/',
'info_dict': {
'id': '255513412_part1',
'ext': 'mp4',
'title': 'Vid"eo" Te\'st',
'cid': '570602418',
'thumbnail': 'http://i2.hdslb.com/bfs/archive/0c0de5a90b6d5b991b8dcc6cde0afbf71d564791.jpg',
'upload_date': '20220408',
'timestamp': 1649436552,
'description': 'Vid"eo" Te\'st',
'uploader_id': '1630758804',
'bv_id': 'BV1NY411E7Rx',
'duration': 60.394,
'uploader': 'bili_31244483705',
'tags': ['VLOG'],
},
'params': {
'skip_download': True,
},
}]
_APP_KEY = 'iVGUTjsxvpLeuDCf'
@@ -225,10 +257,6 @@ class BiliBiliIE(InfoExtractor):
'quality': -2 if 'hd.mp4' in backup_url else -3,
})
for a_format in formats:
a_format.setdefault('http_headers', {}).update({
'Referer': url,
})
for audio in audios:
formats.append({
'url': audio.get('baseUrl') or audio.get('base_url') or audio.get('url'),
@@ -252,13 +280,17 @@ class BiliBiliIE(InfoExtractor):
'id': video_id,
'duration': float_or_none(durl.get('length'), 1000),
'formats': formats,
'http_headers': {
'Referer': url,
},
})
break
self._sort_formats(formats)
title = self._html_search_regex((
r'<h1[^>]+title=(["\'])(?P<content>[^"\']+)',
r'<h1[^>]+title=(["])(?P<content>[^"]+)',
r'<h1[^>]+title=([\'])(?P<content>[^\']+)',
r'(?s)<h1[^>]*>(?P<content>.+?)</h1>',
self._meta_regex('title')
), webpage, 'title', group='content', fatal=False)
@@ -756,15 +788,21 @@ class BiliIntlBaseIE(InfoExtractor):
for i, line in enumerate(json['body']) if line.get('content'))
return data
def _get_subtitles(self, ep_id):
sub_json = self._call_api(f'/web/v2/subtitle?episode_id={ep_id}&platform=web', ep_id)
def _get_subtitles(self, *, ep_id=None, aid=None):
sub_json = self._call_api(
'/web/v2/subtitle', ep_id or aid, note='Downloading subtitles list',
errnote='Unable to download subtitles list', query=filter_dict({
'platform': 'web',
'episode_id': ep_id,
'aid': aid,
}))
subtitles = {}
for sub in sub_json.get('subtitles') or []:
sub_url = sub.get('url')
if not sub_url:
continue
sub_data = self._download_json(
sub_url, ep_id, errnote='Unable to download subtitles', fatal=False,
sub_url, ep_id or aid, errnote='Unable to download subtitles', fatal=False,
note='Downloading subtitles%s' % f' for {sub["lang"]}' if sub.get('lang') else '')
if not sub_data:
continue
@@ -774,9 +812,14 @@ class BiliIntlBaseIE(InfoExtractor):
})
return subtitles
def _get_formats(self, ep_id):
video_json = self._call_api(f'/web/playurl?ep_id={ep_id}&platform=web', ep_id,
note='Downloading video formats', errnote='Unable to download video formats')
def _get_formats(self, *, ep_id=None, aid=None):
video_json = self._call_api(
'/web/playurl', ep_id or aid, note='Downloading video formats',
errnote='Unable to download video formats', query=filter_dict({
'platform': 'web',
'ep_id': ep_id,
'aid': aid,
}))
video_json = video_json['playurl']
formats = []
for vid in video_json.get('video') or []:
@@ -810,23 +853,19 @@ class BiliIntlBaseIE(InfoExtractor):
self._sort_formats(formats)
return formats
def _extract_ep_info(self, episode_data, ep_id):
def _extract_video_info(self, video_data, *, ep_id=None, aid=None):
return {
'id': ep_id,
'title': episode_data.get('title_display') or episode_data['title'],
'thumbnail': episode_data.get('cover'),
'id': ep_id or aid,
'title': video_data.get('title_display') or video_data.get('title'),
'thumbnail': video_data.get('cover'),
'episode_number': int_or_none(self._search_regex(
r'^E(\d+)(?:$| - )', episode_data.get('title_display'), 'episode number', default=None)),
'formats': self._get_formats(ep_id),
'subtitles': self._get_subtitles(ep_id),
r'^E(\d+)(?:$| - )', video_data.get('title_display') or '', 'episode number', default=None)),
'formats': self._get_formats(ep_id=ep_id, aid=aid),
'subtitles': self._get_subtitles(ep_id=ep_id, aid=aid),
'extractor_key': BiliIntlIE.ie_key(),
}
def _login(self):
username, password = self._get_login_info()
if username is None:
return
def _perform_login(self, username, password):
try:
from Cryptodome.PublicKey import RSA
from Cryptodome.Cipher import PKCS1_v1_5
@@ -857,12 +896,9 @@ class BiliIntlBaseIE(InfoExtractor):
else:
raise ExtractorError('Unable to log in')
def _real_initialize(self):
self._login()
class BiliIntlIE(BiliIntlBaseIE):
_VALID_URL = r'https?://(?:www\.)?bili(?:bili\.tv|intl\.com)/(?:[a-z]{2}/)?play/(?P<season_id>\d+)/(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?bili(?:bili\.tv|intl\.com)/(?:[a-z]{2}/)?(play/(?P<season_id>\d+)/(?P<ep_id>\d+)|video/(?P<aid>\d+))'
_TESTS = [{
# Bstation page
'url': 'https://www.bilibili.tv/en/play/34613/341736',
@@ -897,24 +933,35 @@ class BiliIntlIE(BiliIntlBaseIE):
}, {
'url': 'https://www.biliintl.com/en/play/34613/341736',
'only_matching': True,
}, {
# User-generated content (as opposed to a series licensed from a studio)
'url': 'https://bilibili.tv/en/video/2019955076',
'only_matching': True,
}, {
# No language in URL
'url': 'https://www.bilibili.tv/video/2019955076',
'only_matching': True,
}]
def _real_extract(self, url):
season_id, video_id = self._match_valid_url(url).groups()
season_id, ep_id, aid = self._match_valid_url(url).group('season_id', 'ep_id', 'aid')
video_id = ep_id or aid
webpage = self._download_webpage(url, video_id)
# Bstation layout
initial_data = self._parse_json(self._search_regex(
r'window\.__INITIAL_DATA__\s*=\s*({.+?});', webpage,
r'window\.__INITIAL_(?:DATA|STATE)__\s*=\s*({.+?});', webpage,
'preload state', default='{}'), video_id, fatal=False) or {}
episode_data = traverse_obj(initial_data, ('OgvVideo', 'epDetail'), expected_type=dict)
video_data = (
traverse_obj(initial_data, ('OgvVideo', 'epDetail'), expected_type=dict)
or traverse_obj(initial_data, ('UgcVideo', 'videoData'), expected_type=dict) or {})
if not episode_data:
if season_id and not video_data:
# Non-Bstation layout, read through episode list
season_json = self._call_api(f'/web/v2/ogv/play/episodes?season_id={season_id}&platform=web', video_id)
episode_data = next(
episode for episode in traverse_obj(season_json, ('sections', ..., 'episodes', ...), expected_type=dict)
if str(episode.get('episode_id')) == video_id)
return self._extract_ep_info(episode_data, video_id)
video_data = traverse_obj(season_json,
('sections', ..., 'episodes', lambda _, v: str(v['episode_id']) == ep_id),
expected_type=dict, get_all=False)
return self._extract_video_info(video_data, ep_id=ep_id, aid=aid)
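traverse_obj with a callable path element replaces the manual next(...) scan: the lambda sees (key, value) pairs, and get_all=False returns the first matching episode. A toy traversal:

from yt_dlp.utils import traverse_obj

season_json = {'sections': [{'episodes': [
    {'episode_id': 341736}, {'episode_id': 341737},
]}]}
print(traverse_obj(
    season_json, ('sections', ..., 'episodes', lambda _, v: str(v['episode_id']) == '341737'),
    expected_type=dict, get_all=False))
# {'episode_id': 341737}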
class BiliIntlSeriesIE(BiliIntlBaseIE):
@@ -942,7 +989,7 @@ class BiliIntlSeriesIE(BiliIntlBaseIE):
series_json = self._call_api(f'/web/v2/ogv/play/episodes?season_id={series_id}&platform=web', series_id)
for episode in traverse_obj(series_json, ('sections', ..., 'episodes', ...), expected_type=dict, default=[]):
episode_id = str(episode.get('episode_id'))
yield self._extract_ep_info(episode, episode_id)
yield self._extract_video_info(episode, ep_id=episode_id)
def _real_extract(self, url):
series_id = self._match_id(url)


@@ -3,27 +3,28 @@ from __future__ import unicode_literals
from .common import InfoExtractor
from .vk import VKIE
from ..compat import (
compat_b64decode,
compat_urllib_parse_unquote,
from ..compat import compat_b64decode
from ..utils import (
int_or_none,
js_to_json,
traverse_obj,
unified_timestamp,
)
from ..utils import int_or_none
class BIQLEIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?biqle\.(?:com|org|ru)/watch/(?P<id>-?\d+_\d+)'
_TESTS = [{
# Youtube embed
'url': 'https://biqle.ru/watch/-115995369_456239081',
'md5': '97af5a06ee4c29bbf9c001bdb1cf5c06',
'url': 'https://biqle.ru/watch/-2000421746_85421746',
'md5': 'ae6ef4f04d19ac84e4658046d02c151c',
'info_dict': {
'id': '8v4f-avW-VI',
'id': '-2000421746_85421746',
'ext': 'mp4',
'title': "PASSE-PARTOUT - L'ete c'est fait pour jouer",
'description': 'Passe-Partout',
'uploader_id': 'mrsimpsonstef3',
'uploader': 'Phanolito',
'upload_date': '20120822',
'title': 'Forsaken By Hope Studio Clip',
'description': 'Forsaken By Hope Studio Clip — Смотреть онлайн',
'upload_date': '19700101',
'thumbnail': r're:https://[^/]+/impf/7vN3ACwSTgChP96OdOfzFjUCzFR6ZglDQgWsIw/KPaACiVJJxM\.jpg\?size=800x450&quality=96&keep_aspect_ratio=1&background=000000&sign=b48ea459c4d33dbcba5e26d63574b1cb&type=video_thumb',
'timestamp': 0,
},
}, {
'url': 'http://biqle.org/watch/-44781847_168547604',
@@ -32,50 +33,59 @@ class BIQLEIE(InfoExtractor):
'id': '-44781847_168547604',
'ext': 'mp4',
'title': 'Ребенок в шоке от автоматической мойки',
'description': 'Ребенок в шоке от автоматической мойки — Смотреть онлайн',
'timestamp': 1396633454,
'uploader': 'Dmitry Kotov',
'upload_date': '20140404',
'uploader_id': '47850140',
'thumbnail': r're:https://[^/]+/c535507/u190034692/video/l_b84df002\.jpg',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
embed_url = self._proto_relative_url(self._search_regex(
r'<iframe.+?src="((?:https?:)?//(?:daxab\.com|dxb\.to|[^/]+/player)/[^"]+)".*?></iframe>',
webpage, 'embed url'))
title = self._html_search_meta('name', webpage, 'Title', fatal=False)
timestamp = unified_timestamp(self._html_search_meta('uploadDate', webpage, 'Upload Date', default=None))
description = self._html_search_meta('description', webpage, 'Description', default=None)
global_embed_url = self._search_regex(
r'<script[^<]+?window.globEmbedUrl\s*=\s*\'((?:https?:)?//(?:daxab\.com|dxb\.to|[^/]+/player)/[^\']+)\'',
webpage, 'global Embed url')
hash = self._search_regex(
r'<script id="data-embed-video[^<]+?hash: "([^"]+)"[^<]*</script>', webpage, 'Hash')
embed_url = global_embed_url + hash
if VKIE.suitable(embed_url):
return self.url_result(embed_url, VKIE.ie_key(), video_id)
embed_page = self._download_webpage(
embed_url, video_id, headers={'Referer': url})
video_ext = self._get_cookies(embed_url).get('video_ext')
if video_ext:
video_ext = compat_urllib_parse_unquote(video_ext.value)
if not video_ext:
video_ext = compat_b64decode(self._search_regex(
r'video_ext\s*:\s*[\'"]([A-Za-z0-9+/=]+)',
embed_page, 'video_ext')).decode()
video_id, sig, _, access_token = video_ext.split(':')
embed_url, video_id, 'Downloading embed webpage', headers={'Referer': url})
glob_params = self._parse_json(self._search_regex(
r'<script id="globParams">[^<]*window.globParams = ([^;]+);[^<]+</script>',
embed_page, 'Global Parameters'), video_id, transform_source=js_to_json)
host_name = compat_b64decode(glob_params['server'][::-1]).decode()
item = self._download_json(
'https://api.vk.com/method/video.get', video_id,
headers={'User-Agent': 'okhttp/3.4.1'}, query={
'access_token': access_token,
'sig': sig,
'v': 5.44,
f'https://{host_name}/method/video.get/{video_id}', video_id,
headers={'Referer': url}, query={
'token': glob_params['video']['access_token'],
'videos': video_id,
'ckey': glob_params['c_key'],
'credentials': glob_params['video']['credentials'],
})['response']['items'][0]
title = item['title']
formats = []
for f_id, f_url in item.get('files', {}).items():
if f_id == 'external':
return self.url_result(f_url)
ext, height = f_id.split('_')
height_extra_key = traverse_obj(glob_params, ('video', 'partial', 'quality', height))
if height_extra_key:
formats.append({
'format_id': height + 'p',
'url': f_url,
'format_id': f'{height}p',
'url': f'https://{host_name}/{f_url[8:]}&videos={video_id}&extra_key={height_extra_key}',
'height': int_or_none(height),
'ext': ext,
})
@@ -96,10 +106,9 @@ class BIQLEIE(InfoExtractor):
'title': title,
'formats': formats,
'comment_count': int_or_none(item.get('comments')),
'description': item.get('description'),
'description': description,
'duration': int_or_none(item.get('duration')),
'thumbnails': thumbnails,
'timestamp': int_or_none(item.get('date')),
'uploader': item.get('owner_id'),
'timestamp': timestamp,
'view_count': int_or_none(item.get('views')),
}
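The rewrite drops the api.vk.com call; instead the proxy host comes from globParams, where it is stored as a reversed base64 string. A self-contained sketch of the decode (toy round-trip, using base64 in place of compat_b64decode):

import base64

obfuscated = base64.b64encode(b'daxab.com').decode()[::-1]  # what the page embeds
host_name = base64.b64decode(obfuscated[::-1]).decode()
print(host_name)  # daxab.com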


@@ -175,7 +175,7 @@ class BRIE(InfoExtractor):
class BRMediathekIE(InfoExtractor):
IE_DESC = 'Bayerischer Rundfunk Mediathek'
_VALID_URL = r'https?://(?:www\.)?br\.de/mediathek/video/[^/?&#]*?-(?P<id>av:[0-9a-f]{24})'
_VALID_URL = r'https?://(?:www\.)?br\.de/mediathek//?video/(?:[^/?&#]+?-)?(?P<id>av:[0-9a-f]{24})'
_TESTS = [{
'url': 'https://www.br.de/mediathek/video/gesundheit-die-sendung-vom-28112017-av:5a1e6a6e8fce6d001871cc8e',
@@ -188,6 +188,9 @@ class BRMediathekIE(InfoExtractor):
'timestamp': 1511942766,
'upload_date': '20171129',
}
}, {
'url': 'https://www.br.de/mediathek//video/av:61b0db581aed360007558c12',
'only_matching': True,
}]
def _real_extract(self, url):


@@ -29,9 +29,8 @@ class BreitBartIE(InfoExtractor):
self._sort_formats(formats)
return {
'id': video_id,
'title': self._og_search_title(
webpage, default=None) or self._html_search_regex(
r'(?s)<title>(.*?)</title>', webpage, 'video title'),
'title': (self._og_search_title(webpage, default=None)
or self._html_extract_title(webpage, 'video title')),
'description': self._og_search_description(webpage),
'thumbnail': self._og_search_thumbnail(webpage),
'age_limit': self._rta_search(webpage),


@@ -54,7 +54,7 @@ class CallinIE(InfoExtractor):
id = episode['id']
title = (episode.get('title')
or self._og_search_title(webpage, fatal=False)
or self._html_search_regex('<title>(.*?)</title>', webpage, 'title'))
or self._html_extract_title(webpage))
url = episode['m3u8']
formats = self._extract_m3u8_formats(url, display_id, ext='ts')
self._sort_formats(formats)


@@ -0,0 +1,41 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
class CaltransIE(InfoExtractor):
_VALID_URL = r'https?://(?:[^/]+\.)?ca\.gov/vm/loc/[^/]+/(?P<id>[a-z0-9_]+)\.htm'
_TEST = {
'url': 'https://cwwp2.dot.ca.gov/vm/loc/d3/hwy50at24th.htm',
'info_dict': {
'id': 'hwy50at24th',
'ext': 'ts',
'title': 'US-50 : Sacramento : Hwy 50 at 24th',
'live_status': 'is_live',
'thumbnail': 'https://cwwp2.dot.ca.gov/data/d3/cctv/image/hwy50at24th/hwy50at24th.jpg',
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
global_vars = self._search_regex(
r'<script[^<]+?([^<]+\.m3u8[^<]+)</script>',
webpage, 'Global Vars')
route_place = self._search_regex(r'routePlace\s*=\s*"([^"]+)"', global_vars, 'Route Place', fatal=False)
location_name = self._search_regex(r'locationName\s*=\s*"([^"]+)"', global_vars, 'Location Name', fatal=False)
poster_url = self._search_regex(r'posterURL\s*=\s*"([^"]+)"', global_vars, 'Poster Url', fatal=False)
video_stream = self._search_regex(r'videoStreamURL\s*=\s*"([^"]+)"', global_vars, 'Video Stream URL', fatal=False)
formats = self._extract_m3u8_formats(video_stream, video_id, 'ts', live=True)
self._sort_formats(formats)
return {
'id': video_id,
'title': f'{route_place} : {location_name}',
'is_live': True,
'formats': formats,
'thumbnail': poster_url,
}
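The extractor pulls a blob of inline JS first, then picks individual fields out of it with small sub-regexes. The same two-step pattern on toy input:

import re

script = 'var routePlace = "US-50 : Sacramento"; var videoStreamURL = "https://example.com/cam.m3u8";'
print(re.search(r'routePlace\s*=\s*"([^"]+)"', script).group(1))       # US-50 : Sacramento
print(re.search(r'videoStreamURL\s*=\s*"([^"]+)"', script).group(1))   # https://example.com/cam.m3u8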


@@ -245,10 +245,6 @@ class VrtNUIE(GigyaBaseIE):
'upload_date': '20200727',
},
'skip': 'This video is only available for registered users',
'params': {
'username': '<snip>',
'password': '<snip>',
},
'expected_warnings': ['is not a supported codec'],
}, {
# Only available via new API endpoint
@@ -264,24 +260,13 @@ class VrtNUIE(GigyaBaseIE):
'episode_number': 5,
},
'skip': 'This video is only available for registered users',
'params': {
'username': '<snip>',
'password': '<snip>',
},
'expected_warnings': ['Unable to download asset JSON', 'is not a supported codec', 'Unknown MIME type'],
}]
_NETRC_MACHINE = 'vrtnu'
_APIKEY = '3_0Z2HujMtiWq_pkAjgnS2Md2E11a1AwZjYiBETtwNE-EoEHDINgtnvcAOpNgmrVGy'
_CONTEXT_ID = 'R3595707040'
def _real_initialize(self):
self._login()
def _login(self):
username, password = self._get_login_info()
if username is None:
return
def _perform_login(self, username, password):
auth_info = self._gigya_login({
'APIKey': self._APIKEY,
'targetEnv': 'jssdk',


@@ -127,9 +127,9 @@ class CBCIE(InfoExtractor):
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
title = self._og_search_title(webpage, default=None) or self._html_search_meta(
'twitter:title', webpage, 'title', default=None) or self._html_search_regex(
r'<title>([^<]+)</title>', webpage, 'title', fatal=False)
title = (self._og_search_title(webpage, default=None)
or self._html_search_meta('twitter:title', webpage, 'title', default=None)
or self._html_extract_title(webpage))
entries = [
self._extract_player_init(player_init, display_id)
for player_init in re.findall(r'CBC\.APP\.Caffeine\.initInstance\(({.+?})\);', webpage)]


@@ -77,21 +77,21 @@ class CBSIE(CBSBaseIE):
(?:
cbs:|
https?://(?:www\.)?(?:
cbs\.com/(?:shows/[^/]+/video|movies/[^/]+)/|
cbs\.com/(?:shows|movies)/(?:video|[^/]+/video|[^/]+)/|
colbertlateshow\.com/(?:video|podcasts)/)
)(?P<id>[\w-]+)'''
# All tests are blocked outside US
_TESTS = [{
'url': 'https://www.cbs.com/shows/garth-brooks/video/_u7W953k6la293J7EPTd9oHkSPs6Xn6_/connect-chat-feat-garth-brooks/',
'url': 'https://www.cbs.com/shows/video/xrUyNLtl9wd8D_RWWAg9NU2F_V6QpB3R/',
'info_dict': {
'id': '_u7W953k6la293J7EPTd9oHkSPs6Xn6_',
'id': 'xrUyNLtl9wd8D_RWWAg9NU2F_V6QpB3R',
'ext': 'mp4',
'title': 'Connect Chat feat. Garth Brooks',
'description': 'Connect with country music singer Garth Brooks, as he chats with fans on Wednesday November 27, 2013. Be sure to tune in to Garth Brooks: Live from Las Vegas, Friday November 29, at 9/8c on CBS!',
'duration': 1495,
'timestamp': 1385585425,
'upload_date': '20131127',
'title': 'Tough As Nails - Dreams Never Die',
'description': 'md5:a3535a62531cdd52b0364248a2c1ae33',
'duration': 2588,
'timestamp': 1639015200,
'upload_date': '20211209',
'uploader': 'CBSI-NEW',
},
'params': {
@@ -99,14 +99,14 @@ class CBSIE(CBSBaseIE):
'skip_download': True,
},
}, {
'url': 'https://www.cbs.com/shows/the-late-show-with-stephen-colbert/video/60icOhMb9NcjbcWnF_gub9XXHdeBcNk2/the-late-show-6-23-21-christine-baranski-joy-oladokun-',
'url': 'https://www.cbs.com/shows/video/sZH1MGgomIosZgxGJ1l263MFq16oMtW1/',
'info_dict': {
'id': '60icOhMb9NcjbcWnF_gub9XXHdeBcNk2',
'title': 'The Late Show - 6/23/21 (Christine Baranski, Joy Oladokun)',
'timestamp': 1624507140,
'description': 'md5:e01af24e95c74d55e8775aef86117b95',
'id': 'sZH1MGgomIosZgxGJ1l263MFq16oMtW1',
'title': 'The Late Show - 3/16/22 (Michael Buble, Rose Matafeo)',
'timestamp': 1647488100,
'description': 'md5:d0e6ec23c544b7fa8e39a8e6844d2439',
'uploader': 'CBSI-NEW',
'upload_date': '20210624',
'upload_date': '20220317',
},
'params': {
'ignore_no_formats_error': True,


@@ -1,17 +1,14 @@
# coding: utf-8
from __future__ import unicode_literals
import calendar
import datetime
from .common import InfoExtractor
from ..utils import (
clean_html,
extract_timezone,
int_or_none,
parse_duration,
parse_resolution,
try_get,
unified_timestamp,
url_or_none,
)
@@ -95,14 +92,8 @@ class CCMAIE(InfoExtractor):
duration = int_or_none(durada.get('milisegons'), 1000) or parse_duration(durada.get('text'))
tematica = try_get(informacio, lambda x: x['tematica']['text'])
timestamp = None
data_utc = try_get(informacio, lambda x: x['data_emissio']['utc'])
try:
timezone, data_utc = extract_timezone(data_utc)
timestamp = calendar.timegm((datetime.datetime.strptime(
data_utc, '%Y-%d-%mT%H:%M:%S') - timezone).timetuple())
except TypeError:
pass
timestamp = unified_timestamp(data_utc)
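unified_timestamp replaces the manual strptime/extract_timezone/timegm dance and copes with many date formats. For a typical ISO-8601 value:

from yt_dlp.utils import unified_timestamp

print(unified_timestamp('2020-05-27T18:30:00+02:00'))  # 1590597000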
subtitles = {}
subtitols = media.get('subtitols') or []


@@ -54,8 +54,7 @@ class CloserToTruthIE(InfoExtractor):
r'<script[^>]+src=["\'].*?\b(?:partner_id|p)/(\d+)',
webpage, 'kaltura partner_id')
title = self._search_regex(
r'<title>(.+?)\s*\|\s*.+?</title>', webpage, 'video title')
title = self._html_extract_title(webpage, 'video title')
select = self._search_regex(
r'(?s)<select[^>]+id="select-version"[^>]*>(.+?)</select>',


@@ -23,6 +23,7 @@ from ..compat import (
compat_getpass,
compat_http_client,
compat_os_name,
compat_Pattern,
compat_str,
compat_urllib_error,
compat_urllib_parse_unquote,
@@ -41,7 +42,6 @@ from ..utils import (
base_url,
bug_reports_message,
clean_html,
compiled_regex_type,
determine_ext,
determine_protocol,
dict_get,
@@ -49,6 +49,7 @@ from ..utils import (
error_to_compat_str,
extract_attributes,
ExtractorError,
filter_dict,
fix_xml_ampersands,
float_or_none,
format_field,
@@ -75,6 +76,7 @@ from ..utils import (
str_to_int,
strip_or_none,
traverse_obj,
try_get,
unescapeHTML,
UnsupportedError,
unified_strdate,
@@ -137,6 +139,8 @@ class InfoExtractor(object):
for HDS - URL of the F4M manifest,
for DASH - URL of the MPD manifest,
for MSS - URL of the ISM manifest.
* manifest_stream_number (For internal use only)
The index of the stream in the manifest file
* ext Will be calculated from URL if missing
* format A human-readable description of the format
("mp4 container with h264/opus").
@@ -213,7 +217,7 @@ class InfoExtractor(object):
(HTTP or RTMP) download. Boolean.
* has_drm The format has DRM and cannot be downloaded. Boolean
* downloader_options A dictionary of downloader options as
described in FileDownloader
described in FileDownloader (For internal use only)
RTMP formats can also have the additional fields: page_url,
app, play_path, tc_url, flash_version, rtmp_live, rtmp_conn,
rtmp_protocol, rtmp_real_time
@@ -225,6 +229,7 @@ class InfoExtractor(object):
The following fields are optional:
direct: True if a direct video file was given (must only be set by GenericIE)
alt_title: A secondary title of the video.
display_id An alternative identifier for the video, not necessarily
unique, but available before title. Typically, id is
@@ -239,20 +244,21 @@ class InfoExtractor(object):
* "resolution" (optional, string "{width}x{height}",
deprecated)
* "filesize" (optional, int)
* "http_headers" (dict) - HTTP headers for the request
thumbnail: Full URL to a video thumbnail image.
description: Full video description.
uploader: Full name of the video uploader.
license: License name the video is licensed under.
creator: The creator of the video.
timestamp: UNIX timestamp of the moment the video was uploaded
upload_date: Video upload date (YYYYMMDD).
upload_date: Video upload date in UTC (YYYYMMDD).
If not explicitly set, calculated from timestamp
release_timestamp: UNIX timestamp of the moment the video was released.
If it is not clear whether to use timestamp or this, use the former
release_date: The date (YYYYMMDD) when the video was released.
release_date: The date (YYYYMMDD) when the video was released in UTC.
If not explicitly set, calculated from release_timestamp
modified_timestamp: UNIX timestamp of the moment the video was last modified.
modified_date: The date (YYYYMMDD) when the video was last modified.
modified_date: The date (YYYYMMDD) when the video was last modified in UTC.
If not explicitly set, calculated from modified_timestamp
uploader_id: Nickname or id of the video uploader.
uploader_url: Full URL to a personal webpage of the video uploader.
@@ -272,6 +278,8 @@ class InfoExtractor(object):
* "url": A URL pointing to the subtitles file
It can optionally also have:
* "name": Name or description of the subtitles
* "http_headers": A dictionary of additional HTTP headers
to add to the request.
"ext" will be calculated from URL if missing
automatic_captions: Like 'subtitles'; contains automatically generated
captions instead of normal subtitles
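With the per-item http_headers now documented, a subtitles entry can carry its own request headers. A hypothetical value following the fields above:

subtitles = {
    'en': [{
        'url': 'https://example.com/subs/en.vtt',  # hypothetical URL
        'name': 'English',
        'http_headers': {'Referer': 'https://example.com/watch/123'},
    }],
}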
@@ -421,13 +429,21 @@ class InfoExtractor(object):
title, description etc.
Subclasses of this one should re-define the _real_initialize() and
_real_extract() methods and define a _VALID_URL regexp.
Subclasses of this should define a _VALID_URL regexp and re-define the
_real_extract() and (optionally) _real_initialize() methods.
Probably, they should also be added to the list of extractors.
Subclasses may also override suitable() if necessary, but ensure the function
signature is preserved and that this function imports everything it needs
(except other extractors), so that lazy_extractors works correctly
(except other extractors), so that lazy_extractors works correctly.
To support username + password (or netrc) login, the extractor must define a
_NETRC_MACHINE and re-define _perform_login(username, password) and
(optionally) _initialize_pre_login() methods. The _perform_login method will
be called between _initialize_pre_login and _real_initialize if credentials
are passed by the user. In cases where it is necessary to have the login
process as part of the extraction rather than initialization, _perform_login
can be left undefined.
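A minimal sketch of the new hook, assuming a hypothetical site and login endpoint; initialize() only calls _perform_login when the user actually supplied credentials:

from .common import InfoExtractor
from ..utils import urlencode_postdata

class ExampleSiteIE(InfoExtractor):
    _NETRC_MACHINE = 'examplesite'  # hypothetical netrc machine name

    def _perform_login(self, username, password):
        # invoked between _initialize_pre_login() and _real_initialize()
        self._download_json(
            'https://example.com/api/login', None, note='Logging in',
            data=urlencode_postdata({'user': username, 'pass': password}))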
_GEO_BYPASS attribute may be set to False in order to disable
geo restriction bypass mechanisms for a particular extractor.
@@ -455,9 +471,11 @@ class InfoExtractor(object):
_GEO_COUNTRIES = None
_GEO_IP_BLOCKS = None
_WORKING = True
_NETRC_MACHINE = None
IE_DESC = None
_LOGIN_HINTS = {
'any': 'Use --cookies, --username and --password, or --netrc to provide account credentials',
'any': 'Use --cookies, --cookies-from-browser, --username and --password, or --netrc to provide account credentials',
'cookies': (
'Use --cookies-from-browser or --cookies for the authentication. '
'See https://github.com/ytdl-org/youtube-dl#how-do-i-pass-cookies-to-youtube-dl for how to manually pass cookies'),
@@ -507,6 +525,10 @@ class InfoExtractor(object):
"""Getter method for _WORKING."""
return cls._WORKING
@classmethod
def supports_login(cls):
return bool(cls._NETRC_MACHINE)
def initialize(self):
"""Initializes an instance (authentication, etc)."""
self._printed_messages = set()
@@ -515,6 +537,13 @@ class InfoExtractor(object):
'ip_blocks': self._GEO_IP_BLOCKS,
})
if not self._ready:
self._initialize_pre_login()
if self.supports_login():
username, password = self._get_login_info()
if username:
self._perform_login(username, password)
elif self.get_param('username') and False not in (self.IE_DESC, self._NETRC_MACHINE):
self.report_warning(f'Login with password is not supported for this website. {self._LOGIN_HINTS["cookies"]}')
self._real_initialize()
self._ready = True
@@ -635,7 +664,7 @@ class InfoExtractor(object):
}
if hasattr(e, 'countries'):
kwargs['countries'] = e.countries
raise type(e)(e.msg, **kwargs)
raise type(e)(e.orig_msg, **kwargs)
except compat_http_client.IncompleteRead as e:
raise ExtractorError('A network error has occurred.', cause=e, expected=True, video_id=self.get_temp_id(url))
except (KeyError, StopIteration) as e:
@@ -657,16 +686,24 @@ class InfoExtractor(object):
return False
def set_downloader(self, downloader):
"""Sets the downloader for this IE."""
"""Sets a YoutubeDL instance as the downloader for this IE."""
self._downloader = downloader
def _initialize_pre_login(self):
""" Intialization before login. Redefine in subclasses."""
pass
def _perform_login(self, username, password):
""" Login with username and password. Redefine in subclasses."""
pass
def _real_initialize(self):
"""Real initialization process. Redefine in subclasses."""
pass
def _real_extract(self, url):
"""Real extraction process. Redefine in subclasses."""
pass
raise NotImplementedError('This method must be implemented by subclasses')
@classmethod
def ie_key(cls):
@@ -745,7 +782,7 @@ class InfoExtractor(object):
errmsg = '%s: %s' % (errnote, error_to_compat_str(err))
if fatal:
raise ExtractorError(errmsg, sys.exc_info()[2], cause=err)
raise ExtractorError(errmsg, cause=err)
else:
self.report_warning(errmsg)
return False
@@ -1000,7 +1037,7 @@ class InfoExtractor(object):
if transform_source:
json_string = transform_source(json_string)
try:
return json.loads(json_string)
return json.loads(json_string, strict=False)
except ValueError as ve:
errmsg = '%s: Failed to parse JSON ' % video_id
if fatal:
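Passing strict=False lets json.loads accept raw control characters (e.g. literal newlines) inside string values, which some sites emit and which strict parsing rejects:

import json

blob = '{"title": "line one\nline two"}'  # contains a real newline in the value
# json.loads(blob) raises: Invalid control character
print(json.loads(blob, strict=False))  # {'title': 'line one\nline two'}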
@@ -1093,11 +1130,15 @@ class InfoExtractor(object):
def raise_login_required(
self, msg='This video is only available for registered users',
metadata_available=False, method='any'):
metadata_available=False, method=NO_DEFAULT):
if metadata_available and (
self.get_param('ignore_no_formats_error') or self.get_param('wait_for_video')):
self.report_warning(msg)
return
if method is NO_DEFAULT:
method = 'any' if self.supports_login() else 'cookies'
if method is not None:
assert method in self._LOGIN_HINTS, 'Invalid login method'
msg = '%s. %s' % (msg, self._LOGIN_HINTS[method])
raise ExtractorError(msg, expected=True)
@@ -1135,8 +1176,8 @@ class InfoExtractor(object):
'url': url,
}
def playlist_from_matches(self, matches, playlist_id=None, playlist_title=None, getter=None, ie=None, **kwargs):
urls = (self.url_result(self._proto_relative_url(m), ie)
def playlist_from_matches(self, matches, playlist_id=None, playlist_title=None, getter=None, ie=None, video_kwargs=None, **kwargs):
urls = (self.url_result(self._proto_relative_url(m), ie, **(video_kwargs or {}))
for m in orderedSet(map(getter, matches) if getter else matches))
return self.playlist_result(urls, playlist_id, playlist_title, **kwargs)
@@ -1162,7 +1203,9 @@ class InfoExtractor(object):
In case of failure return a default value or raise a WARNING or a
RegexNotFoundError, depending on fatal, specifying the field name.
"""
if isinstance(pattern, (str, compat_str, compiled_regex_type)):
if string is None:
mobj = None
elif isinstance(pattern, (str, compat_Pattern)):
mobj = re.search(pattern, string, flags)
else:
for p in pattern:
@@ -1258,8 +1301,8 @@ class InfoExtractor(object):
@staticmethod
def _og_regexes(prop):
content_re = r'content=(?:"([^"]+?)"|\'([^\']+?)\'|\s*([^\s"\'=<>`]+?))'
property_re = (r'(?:name|property)=(?:\'og[:-]%(prop)s\'|"og[:-]%(prop)s"|\s*og[:-]%(prop)s\b)'
% {'prop': re.escape(prop)})
property_re = (r'(?:name|property)=(?:\'og%(sep)s%(prop)s\'|"og%(sep)s%(prop)s"|\s*og%(sep)s%(prop)s\b)'
% {'prop': re.escape(prop), 'sep': '(?:&#x3A;|[:-])'})
template = r'<meta[^>]+?%s[^>]+?%s'
return [
template % (property_re, content_re),
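The widened separator also matches Open Graph properties whose colon is HTML-encoded, which some pages emit. A quick check of the new alternation:

import re

sep = '(?:&#x3A;|[:-])'
for tag in ('<meta property="og:title"', '<meta property="og&#x3A;title"'):
    print(bool(re.search('og%stitle' % sep, tag)))  # True, True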
@@ -1290,9 +1333,8 @@ class InfoExtractor(object):
def _og_search_description(self, html, **kargs):
return self._og_search_property('description', html, fatal=False, **kargs)
def _og_search_title(self, html, **kargs):
kargs.setdefault('fatal', False)
return self._og_search_property('title', html, **kargs)
def _og_search_title(self, html, *, fatal=False, **kargs):
return self._og_search_property('title', html, fatal=fatal, **kargs)
def _og_search_video_url(self, html, name='video url', secure=True, **kargs):
regexes = self._og_regexes('video') + self._og_regexes('video:url')
@@ -1303,6 +1345,9 @@ class InfoExtractor(object):
def _og_search_url(self, html, **kargs):
return self._og_search_property('url', html, **kargs)
def _html_extract_title(self, html, name='title', *, fatal=False, **kwargs):
return self._html_search_regex(r'(?s)<title>([^<]+)</title>', html, name, fatal=fatal, **kwargs)
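Several hunks in this diff (Breitbart, Callin, CBC, CloserToTruth) swap bespoke <title> regexes for this helper; the underlying pattern is just:

import re

webpage = '<html><head><title>Clip Name | Example</title></head></html>'
print(re.search(r'(?s)<title>([^<]+)</title>', webpage).group(1))  # Clip Name | Example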
def _html_search_meta(self, name, html, display_name=None, fatal=False, **kwargs):
name = variadic(name)
if display_name is None:
@@ -1546,7 +1591,7 @@ class InfoExtractor(object):
break
traverse_json_ld(json_ld)
return dict((k, v) for k, v in info.items() if v is not None)
return filter_dict(info)
def _search_nextjs_data(self, webpage, video_id, *, transform_source=None, fatal=True, **kw):
return self._parse_json(
@@ -1609,7 +1654,7 @@ class InfoExtractor(object):
'vcodec': {'type': 'ordered', 'regex': True,
'order': ['av0?1', 'vp0?9.2', 'vp0?9', '[hx]265|he?vc?', '[hx]264|avc', 'vp0?8', 'mp4v|h263', 'theora', '', None, 'none']},
'acodec': {'type': 'ordered', 'regex': True,
'order': ['[af]lac', 'wav|aiff', 'opus', 'vorbis', 'aac', 'mp?4a?', 'mp3', 'e-?a?c-?3', 'ac-?3', 'dts', '', None, 'none']},
'order': ['[af]lac', 'wav|aiff', 'opus', 'vorbis|ogg', 'aac', 'mp?4a?', 'mp3', 'e-?a?c-?3', 'ac-?3', 'dts', '', None, 'none']},
'hdr': {'type': 'ordered', 'regex': True, 'field': 'dynamic_range',
'order': ['dv', '(hdr)?12', r'(hdr)?10\+', '(hdr)?10', 'hlg', '', 'sdr', None]},
'proto': {'type': 'ordered', 'regex': True, 'field': 'protocol',
@@ -1652,31 +1697,31 @@ class InfoExtractor(object):
'format_id': {'type': 'alias', 'field': 'id'},
'preference': {'type': 'alias', 'field': 'ie_pref'},
'language_preference': {'type': 'alias', 'field': 'lang'},
'source_preference': {'type': 'alias', 'field': 'source'},
'protocol': {'type': 'alias', 'field': 'proto'},
'filesize_approx': {'type': 'alias', 'field': 'fs_approx'},
# Deprecated
'dimension': {'type': 'alias', 'field': 'res'},
'resolution': {'type': 'alias', 'field': 'res'},
'extension': {'type': 'alias', 'field': 'ext'},
'bitrate': {'type': 'alias', 'field': 'br'},
'total_bitrate': {'type': 'alias', 'field': 'tbr'},
'video_bitrate': {'type': 'alias', 'field': 'vbr'},
'audio_bitrate': {'type': 'alias', 'field': 'abr'},
'framerate': {'type': 'alias', 'field': 'fps'},
'protocol': {'type': 'alias', 'field': 'proto'},
'source_preference': {'type': 'alias', 'field': 'source'},
'filesize_approx': {'type': 'alias', 'field': 'fs_approx'},
'filesize_estimate': {'type': 'alias', 'field': 'size'},
'samplerate': {'type': 'alias', 'field': 'asr'},
'video_ext': {'type': 'alias', 'field': 'vext'},
'audio_ext': {'type': 'alias', 'field': 'aext'},
'video_codec': {'type': 'alias', 'field': 'vcodec'},
'audio_codec': {'type': 'alias', 'field': 'acodec'},
'video': {'type': 'alias', 'field': 'hasvid'},
'has_video': {'type': 'alias', 'field': 'hasvid'},
'audio': {'type': 'alias', 'field': 'hasaud'},
'has_audio': {'type': 'alias', 'field': 'hasaud'},
'extractor': {'type': 'alias', 'field': 'ie_pref'},
'extractor_preference': {'type': 'alias', 'field': 'ie_pref'},
'dimension': {'type': 'alias', 'field': 'res', 'deprecated': True},
'resolution': {'type': 'alias', 'field': 'res', 'deprecated': True},
'extension': {'type': 'alias', 'field': 'ext', 'deprecated': True},
'bitrate': {'type': 'alias', 'field': 'br', 'deprecated': True},
'total_bitrate': {'type': 'alias', 'field': 'tbr', 'deprecated': True},
'video_bitrate': {'type': 'alias', 'field': 'vbr', 'deprecated': True},
'audio_bitrate': {'type': 'alias', 'field': 'abr', 'deprecated': True},
'framerate': {'type': 'alias', 'field': 'fps', 'deprecated': True},
'filesize_estimate': {'type': 'alias', 'field': 'size', 'deprecated': True},
'samplerate': {'type': 'alias', 'field': 'asr', 'deprecated': True},
'video_ext': {'type': 'alias', 'field': 'vext', 'deprecated': True},
'audio_ext': {'type': 'alias', 'field': 'aext', 'deprecated': True},
'video_codec': {'type': 'alias', 'field': 'vcodec', 'deprecated': True},
'audio_codec': {'type': 'alias', 'field': 'acodec', 'deprecated': True},
'video': {'type': 'alias', 'field': 'hasvid', 'deprecated': True},
'has_video': {'type': 'alias', 'field': 'hasvid', 'deprecated': True},
'audio': {'type': 'alias', 'field': 'hasaud', 'deprecated': True},
'has_audio': {'type': 'alias', 'field': 'hasaud', 'deprecated': True},
'extractor': {'type': 'alias', 'field': 'ie_pref', 'deprecated': True},
'extractor_preference': {'type': 'alias', 'field': 'ie_pref', 'deprecated': True},
}
def __init__(self, ie, field_preference):
@@ -1776,7 +1821,7 @@ class InfoExtractor(object):
continue
if self._get_field_setting(field, 'type') == 'alias':
alias, field = field, self._get_field_setting(field, 'field')
if alias not in ('format_id', 'preference', 'language_preference'):
if self._get_field_setting(alias, 'deprecated'):
self.ydl.deprecation_warning(
f'Format sorting alias {alias} is deprecated '
f'and may be removed in a future version. Please use {field} instead')
@@ -2875,7 +2920,8 @@ class InfoExtractor(object):
segment_duration = None
if 'total_number' not in representation_ms_info and 'segment_duration' in representation_ms_info:
segment_duration = float_or_none(representation_ms_info['segment_duration'], representation_ms_info['timescale'])
representation_ms_info['total_number'] = int(math.ceil(float(period_duration) / segment_duration))
representation_ms_info['total_number'] = int(math.ceil(
float_or_none(period_duration, segment_duration, default=0)))
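# float_or_none(value, scale) is value / scale, so a 120.5s period with 4s
# segments gives int(math.ceil(120.5 / 4)) == 31 fragments; a missing
# period_duration now falls back to 0 instead of raising TypeError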
representation_ms_info['fragments'] = [{
media_location_key: media_template % {
'Number': segment_number,
@@ -2966,6 +3012,10 @@ class InfoExtractor(object):
f['url'] = initialization_url
f['fragments'].append({location_key(initialization_url): initialization_url})
f['fragments'].extend(representation_ms_info['fragments'])
if not period_duration:
period_duration = try_get(
representation_ms_info,
lambda r: sum(frag['duration'] for frag in r['fragments']), float)
else:
# Assuming direct URL to unfragmented media.
f['url'] = base_url
@@ -3108,7 +3158,7 @@ class InfoExtractor(object):
})
return formats, subtitles
def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8', mpd_id=None, preference=None, quality=None):
def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8_native', mpd_id=None, preference=None, quality=None):
def absolute_url(item_url):
return urljoin(base_url, item_url)
@@ -3636,11 +3686,11 @@ class InfoExtractor(object):
@staticmethod
def _merge_subtitle_items(subtitle_list1, subtitle_list2):
""" Merge subtitle items for one language. Items with duplicated URLs
""" Merge subtitle items for one language. Items with duplicated URLs/data
will be dropped. """
list1_urls = set([item['url'] for item in subtitle_list1])
list1_data = set((item.get('url'), item.get('data')) for item in subtitle_list1)
ret = list(subtitle_list1)
ret.extend([item for item in subtitle_list2 if item['url'] not in list1_urls])
ret.extend(item for item in subtitle_list2 if (item.get('url'), item.get('data')) not in list1_data)
return ret
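Keying the dedupe on the (url, data) pair handles data-only subtitle items, which the old url-only set lookup would crash on. A toy run of the new logic:

list1 = [{'url': 'https://a/en.vtt'}, {'data': 'WEBVTT ...'}]
list2 = [{'url': 'https://a/en.vtt'}, {'url': 'https://b/en.vtt'}]
seen = {(i.get('url'), i.get('data')) for i in list1}
merged = list1 + [i for i in list2 if (i.get('url'), i.get('data')) not in seen]
print(len(merged))  # 3 -- the duplicate https://a/en.vtt entry was dropped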
@classmethod
@@ -3665,9 +3715,8 @@ class InfoExtractor(object):
def mark_watched(self, *args, **kwargs):
if not self.get_param('mark_watched', False):
return
if (self._get_login_info()[0] is not None
or self.get_param('cookiefile')
or self.get_param('cookiesfrombrowser')):
if (self.supports_login() and self._get_login_info()[0] is not None
or self.get_param('cookiefile') or self.get_param('cookiesfrombrowser')):
self._mark_watched(*args, **kwargs)
def _mark_watched(self, *args, **kwargs):
