Compare commits

211 Commits

Author SHA1 Message Date
Philipp Hagemeister
f3711edcf1 release 2015.12.13 2015-12-13 10:52:59 +01:00
Yen Chi Hsuan
22d07ba4e4 [infoq] Fix extraction for HTTP URLs (closes #7739) 2015-12-13 17:29:27 +08:00
Yen Chi Hsuan
f6abca506e [nowvideo] Skip deleted test case 2015-12-13 15:43:20 +08:00
Yen Chi Hsuan
b5424acdb9 [novamov] Improve existence checking 2015-12-13 15:43:20 +08:00
Yen Chi Hsuan
47c7f3d995 [novamov] Fix filekey extraction (closes #7764) 2015-12-13 15:43:20 +08:00
Sergey M․
0014ffa829 [funimation] Improve login 2015-12-13 07:17:42 +06:00
Sergey M․
c03943f394 Credit @Slyneth for funimation (#7775) 2015-12-12 15:19:23 +06:00
Yen Chi Hsuan
deb1e8d20e [youku] Put the missing item to get_hd 2015-12-12 15:49:19 +08:00
Yen Chi Hsuan
174964a7bc Credit @Celthi for fixing Youku extractor 2015-12-12 15:34:40 +08:00
Yen Chi Hsuan
9c568178fb Merge branch 'Celthi-youku_bugfix' 2015-12-12 15:26:43 +08:00
Yen Chi Hsuan
dbb7d7e26c [youku] Reorder format items 2015-12-12 15:24:58 +08:00
Yen Chi Hsuan
ade2340971 [youku] Simplify 2015-12-12 15:19:14 +08:00
Yen Chi Hsuan
4d77550cf0 [youku] Fix tests 2015-12-12 14:57:14 +08:00
Yen Chi Hsuan
c683454e7e [youku] MD5 is unstable 2015-12-12 14:48:46 +08:00
Yen Chi Hsuan
f133fd326b [youku] Cleanup and PEP8 2015-12-12 14:41:53 +08:00
Yen Chi Hsuan
1faa66f005 Merge branch 'youku_bugfix' of https://github.com/Celthi/youtube-dl into Celthi-youku_bugfix 2015-12-12 14:36:29 +08:00
Yen Chi Hsuan
8773f3158f [safari] Use postdata_urlencode (#7465) 2015-12-12 14:28:05 +08:00
Celthi
7e37c39485 merge data1 and data2 2015-12-12 11:26:15 +08:00
Celthi
14c17cafa1 add support to video protected by password 2015-12-12 11:21:44 +08:00
Celthi
8696a7fd13 fix the keyerror(mp4hd), todo support download the video protected by password 2015-12-12 10:44:21 +08:00
Sergey M․
d63cfc3f0f [beeg] API v5 (Closes #7846) 2015-12-12 02:52:20 +06:00
Sergey M․
f377f44dae [funimation] Improve extraction 2015-12-12 01:02:54 +06:00
Sergey M․
0b1bb1ac3a [funimation] Add test for promotional video 2015-12-12 00:52:00 +06:00
Sergey M․
f208e52a76 [funimation] Fix promotional videos extraction 2015-12-12 00:48:09 +06:00
Sergey M․
b091529a3c [funimation] Extend _VALID_URL to match promotional videos 2015-12-12 00:43:03 +06:00
Sergey M․
b323a3cbff [funimation] Remove unused import 2015-12-12 00:39:44 +06:00
Sergey M․
b59623ef43 [funimation] Use mobile webpage for workaround hulu error 2015-12-12 00:38:58 +06:00
Sergey M․
9c163950da [funimation] Improve _VALID_URL 2015-12-11 23:20:10 +06:00
Sergey M․
d357bbd375 [funimation] Update test 2015-12-11 23:06:44 +06:00
Sergey M․
f542a3d26b [funimation] Improve extraction (Closes #7775) 2015-12-11 23:00:38 +06:00
Sergey M․
59a4ff482a [funimation] Real UA is required for login 2015-12-11 23:00:37 +06:00
Sergey M․
40ca5b04f4 [funimation] Remove unnecessary login form field 2015-12-11 23:00:37 +06:00
Sergey M․
411e5b88c9 [funimation] Fix login message 2015-12-11 23:00:37 +06:00
Sergey M․
b4c299bad0 [funimation] PEP 8 2015-12-11 23:00:36 +06:00
Muratcan Simsek
ab4bdc913f [funimation] Add new extractor
Update funimation.py

Update funimation.py

Removed unnecessary lines.

Update funimation.py

Added thumbnail and description.

Filename improvement.

fixed TEST.
2015-12-11 23:00:35 +06:00
remitamine
1fe248a51b Merge pull request #7833 from remitamine/ooyala
[ooyala] improve extraction
2015-12-11 17:55:32 +01:00
remitamine
2559b9d017 [wdr] extract all formats(closes #7788) 2015-12-11 17:31:33 +01:00
Sergey M․
4db43567e8 [downloader/f4m] Decode manifest before fixing 2015-12-11 20:28:44 +06:00
Celthi
5333842a1d According the blog and you-get fixed the issues #7627. 2015-12-11 20:08:14 +08:00
Celthi
98c3806b15 fix some not important codesnips 2015-12-11 19:18:14 +08:00
Yen Chi Hsuan
b6afc225c8 [vevo] Use _download_smil to provide informative error messages 2015-12-11 19:16:51 +08:00
Yen Chi Hsuan
ad30dc1e20 [vevo] Allow calling API without https
Not all proxies allow CONNECT
2015-12-11 19:07:13 +08:00
Yen Chi Hsuan
ff51983e15 [vevo] Handle videos without video_info (#7802) 2015-12-11 18:52:03 +08:00
Celthi
fdf01663d1 able to download first part of the video, but fail in the left part 2015-12-11 17:48:40 +08:00
Yen Chi Hsuan
4b94288301 [vevo] Use _match_id 2015-12-11 17:32:29 +08:00
Yen Chi Hsuan
4bf99ade15 [vevo] Catch the georestriction message (#7802) 2015-12-11 14:25:01 +08:00
remitamine
75ed53320e [ooyala] improve extraction 2015-12-10 19:08:16 +01:00
Sergey M․
17b786ae73 [downloader/f4m] Fix malformed manifests (Closes #7823) 2015-12-10 22:59:50 +06:00
Sergey M
dfd42a43c3 Merge pull request #7821 from joksnet/patch-1
[FFmpegPostProcessor] Default of prefer ffmpeg
2015-12-10 22:10:20 +06:00
Philipp Hagemeister
f7b8dd63f0 release 2015.12.10 2015-12-10 17:05:13 +01:00
Sergey M․
a8abf124c8 [dailymotion] Add subtitles test URL for reference 2015-12-10 21:54:48 +06:00
Sergey M․
176ccefcd8 [pbs] PEP 8 2015-12-10 21:33:40 +06:00
Sergey M․
cbd2ffd031 [dailymotion] Fix subtitles extraction 2015-12-10 21:29:07 +06:00
Sergey M․
0b534d2adc [dailymotion] Restrict player v5 regex (Closes #7826) 2015-12-10 21:27:47 +06:00
Sergey M․
526a20bd16 [pbs] Clarify member stations' URLs 2015-12-10 21:04:26 +06:00
Celthi
51094b1b08 add cookie and referer in headers, change the video url 2015-12-10 21:42:12 +08:00
Philipp Hagemeister
f1ac2033ab Merge pull request #7827 from habi/master
Updating README.md
2015-12-10 13:54:18 +01:00
David Haberthür
a1b8d815f5 Reverting markup changes 2015-12-10 13:45:53 +01:00
David Haberthür
8b756bd98e Merge branch 'update-readme' 2015-12-10 13:20:25 +01:00
David Haberthür
46047c58d0 Updating README.md
- Harmonizing mentions of **youtube-dl** in the text
- Removing unnecessary Markdown markup for headers
- Adding some links
2015-12-10 13:19:26 +01:00
Juan M Martínez
374c761e77 [FFmpegPostProcessor] Default of prefer ffmpeg
When no `downloader` is passed to `FFmpegPostProcessor`
an exception was raised trying to get the prefer ffmpeg param.

    AttributeError: 'NoneType' object has no attribute 'params'

This fixes and defaults to `False`.
2015-12-09 20:56:00 -03:00
Sergey M․
6c7b26e13f [pbs] Make URLs lowercase 2015-12-09 21:28:04 +06:00
Sergey M․
b51b108045 [pbs] Clean up stations list from duplicates 2015-12-09 21:23:19 +06:00
Philipp Hagemeister
86e8c89488 release 2015.12.09 2015-12-09 15:32:26 +01:00
Jaime Marquínez Ferrándiz
47f48f5d85 [test/test_all_urls] Update pbs extractor name
It's in lowercase now (since e15e2ef7a0).
2015-12-08 21:12:13 +01:00
Sergey M․
e15e2ef7a0 [pbs] Add support for all member stations (#7674) 2015-12-09 01:51:34 +06:00
Sergey M․
d0c8b279da [pbs] Add another coveplayer pattern (Closes #7674) 2015-12-08 23:34:43 +06:00
Sergey M․
612d83b51d [pbs] Extend _VALID_URL 2015-12-08 23:28:36 +06:00
Sergey M
9c30efeb7e Merge pull request #7792 from jindaxia/fix_sohu_403forbidden
[sohu] Fix 403 forbidden
2015-12-08 22:54:14 +06:00
Sergey M․
39fa4cc107 [cliphunter] Fix extraction (Closes #7796) 2015-12-08 21:56:00 +06:00
Sergey M․
b09c122373 [nbc] Add another theplatform pattern 2015-12-08 21:35:42 +06:00
Sergey M
3348243b7b [README.md] Clarify verbose log requirements 2015-12-08 21:34:26 +06:00
Sergey M․
b46b65ed37 [nbc] Smuggle referer (Closes #7791) 2015-12-08 21:16:14 +06:00
Sergey M․
18e4088fad [theplatform] Add support for referer protected videos wuth explicit SMIL 2015-12-08 21:15:45 +06:00
虾哥哥
5fd6cd64f9 [sohu]fix 403 forbidden 2015-12-08 14:14:14 +08:00
Sergey M․
3d24bbfbe4 [YoutubeDL] Check formats for merge to be opposite (#7786) 2015-12-07 23:10:57 +06:00
Sergey M․
1775612512 [wimp] Improve video URL regex 2015-12-07 22:18:00 +06:00
Sergey M․
0d2d967cc7 [wimp] Fix extraction (Closes #7784) 2015-12-07 22:14:45 +06:00
Sergey M․
a5e52a1fd4 [vk] Add test for pladform embed 2015-12-07 22:05:54 +06:00
Sergey M․
291a93bafa [vk] Remove unnecessary message 2015-12-07 22:04:47 +06:00
Sergey M․
c4737bea17 [vk] Add support for pladform embeds (Closes #7780) 2015-12-07 22:03:52 +06:00
Sergey M․
45dad7ba1b [extractor/generic] Use _extract_url for pladform 2015-12-07 22:03:21 +06:00
Sergey M․
db7c9da871 [pladform] Add _extract_url routine 2015-12-07 22:02:45 +06:00
Philipp Hagemeister
bc92621ade release 2015.12.06 2015-12-06 18:51:25 +01:00
Sergey M․
fd8e559c3a [beeg] Switch to api v4 (Closes #7774) 2015-12-06 23:47:10 +06:00
Sergey M․
222e11d4ae [bbc] Add another pattern for playlist.sxml (Closes #7743) 2015-12-06 16:41:12 +06:00
Sergey M․
7d682f0acb [nowtv] Extend _VALID_URL to support jahr URLs (Closes #7755) 2015-12-06 16:18:59 +06:00
Yen Chi Hsuan
8364b6b0b1 [iqiyi] Update key
Closes #7772
2015-12-06 16:41:02 +08:00
Sergey M․
7ac40e5521 [nowvideo] Update test 2015-12-06 09:42:20 +06:00
Sergey M․
36066dd3ee [movshare] Rename to wholecloud 2015-12-06 09:42:00 +06:00
Sergey M․
636aa83ed3 [cloudtime] Add extractor 2015-12-06 09:37:38 +06:00
Sergey M․
33d152b6cc [novamov] Move all novamov based extractors to a single place
For easier navigation
2015-12-06 09:29:41 +06:00
remitamine
51c4fec0d5 [nba] use int_or_none for tbr 2015-12-05 21:04:22 +01:00
remitamine
0017486dca [nba] use int instead of int_or_none 2015-12-05 20:58:44 +01:00
Sergey M․
edc70f4aaf [pluralsight] Fix format code split while guessing quality 2015-12-06 01:40:13 +06:00
Sergey M․
756926ff00 [pluralsight] Add support for widescreen videos (Closes #7766) 2015-12-06 01:39:28 +06:00
remitamine
cb160dd531 [nba] handle format info properly 2015-12-05 18:47:15 +01:00
Jaime Marquínez Ferrándiz
77334ccb44 [metacafe] Fix age limit extraction 2015-12-05 16:12:50 +01:00
Jaime Marquínez Ferrándiz
796db21295 [metacafe] Fix video url extraction (closes #7763) 2015-12-05 16:12:02 +01:00
Philipp Hagemeister
535d7b681b release 2015.12.05 2015-12-05 16:01:37 +01:00
Sergey M․
960e038886 [hypem] Modernize 2015-12-05 20:46:57 +06:00
Sergey M․
ea14422ff1 [hypem] Correctly handle cookies (Closes #7762) 2015-12-05 20:42:21 +06:00
Yen Chi Hsuan
38d05d17e5 [fc2] Fix test_FC2_1 2015-12-05 21:10:26 +08:00
Yen Chi Hsuan
db9bd5267f [keezmovies] Fix extraction
Also fixes #7752
2015-12-05 17:26:13 +08:00
remitamine
ab3b773bbe [acast] change tests into more stable casts and work with channel extractor only if it didn't match cast regex 2015-12-05 10:14:34 +01:00
Yen Chi Hsuan
0bc4ee60e0 [bbc] Fix test_BBC_6 2015-12-05 16:55:53 +08:00
Yen Chi Hsuan
a3ef0e1cdd [bbc.co.uk] Skip removed test video 2015-12-05 16:55:53 +08:00
Yen Chi Hsuan
679bacf0b5 [bbc.co.uk] Fix test_BBCCoUk
This is similar to the one in #7756, So also fixes #7756.
2015-12-05 16:55:53 +08:00
remitamine
02e3952f3b [trilulilu] handle errors 2015-12-05 09:42:00 +01:00
Yen Chi Hsuan
64b7e89c0c [srf] Support audios (closes #7760) 2015-12-05 16:26:30 +08:00
remitamine
bee4c5571a [clipfish] improve extraction 2015-12-04 16:38:05 +01:00
remitamine
96929dd1e8 [skynewsarabia] fix extractor name 2015-12-04 16:23:44 +01:00
remitamine
53e06b2507 [ooyala] fix duration scale 2015-12-04 16:18:02 +01:00
remitamine
b80d4bebf3 [nba] fix extraction errors 2015-12-04 16:04:22 +01:00
Jaime Marquínez Ferrándiz
55bec9b658 [clipfish] Remove unused import and style fix 2015-12-04 14:29:37 +01:00
Jaime Marquínez Ferrándiz
2a63b0f110 [mixcloud] Fix extraction of the audio url (fixes #7751) 2015-12-04 14:26:34 +01:00
remitamine
07b88cffce Merge pull request #7686 from remitamine/acast
[acast] Add new extractor
2015-12-04 09:10:02 +01:00
remitamine
58c8451f36 Merge pull request #7660 from remitamine/gameinformer
[gameinformer] Add new extractor(closes #3376)
2015-12-04 09:03:21 +01:00
remitamine
3047121c63 Merge pull request #7320 from remitamine/adobetv
[adobetv] improve extraction and add support specific language video,show and channel extraction
2015-12-04 08:54:06 +01:00
remitamine
7079f8ff1f [adobetv] use compat_str 2015-12-04 08:44:18 +01:00
remitamine
2c3b9f3570 [adobetv] use a variable for api base url 2015-12-04 08:37:08 +01:00
remitamine
fad2428f47 [gameinformer] split long line 2015-12-04 08:24:04 +01:00
remitamine
c3d3110f6a Merge pull request #7185 from remitamine/ooyala
[ooyala] extract more formats and metadata
2015-12-04 08:23:21 +01:00
remitamine
79ec00276c Merge pull request #7326 from remitamine/clipfish
[clipfish] improve info extraction
2015-12-04 07:57:58 +01:00
remitamine
9c117d345f [nba] improve(fixes #7068)
* extract more formats
* extract videos from team mini sites
* extract more metadata
2015-12-04 07:20:27 +01:00
remitamine
46cc1c65a4 [nba] use xpath utils 2015-12-04 07:09:48 +01:00
remitamine
71d9fe7818 [trilulilu] improve extraction 2015-12-04 06:53:33 +01:00
remitamine
4ccabf93db [trilulilu] fix info extraction 2015-12-04 00:51:02 +01:00
remitamine
6612a34939 [bilibili] flake8 2015-12-03 22:43:19 +01:00
remitamine
e5b4225f7c [audimedia] flake8 2015-12-03 22:25:08 +01:00
remitamine
b2ca35ddbc Merge pull request #7745 from remitamine/bilibili
[bilibili] use xpath_text and catch errors in xml document
2015-12-03 22:11:41 +01:00
remitamine
76ab842d9b [bilibili] use xpath_text and catch errors in xml document 2015-12-03 22:01:32 +01:00
remitamine
24dc1ed715 Merge pull request #7659 from remitamine/audimedia
[audimedia] Add new extractor(closes #7654)
2015-12-03 20:28:52 +01:00
remitamine
682d8dcd21 Merge pull request #7210 from remitamine/bilibili
[bilibili] fix info extraction(fixes #7182)
2015-12-03 20:16:54 +01:00
remitamine
640bb54e73 Merge branch 'master' of https://github.com/rg3/youtube-dl into bilibili 2015-12-03 20:05:11 +01:00
Sergey M․
e0977d7686 [beeg] Decrypt URL (Closes #7736) 2015-12-04 00:59:32 +06:00
remitamine
112ab398db Merge pull request #7681 from remitamine/skynewarabia
[skynewsarabia] Add new extractor
2015-12-03 18:41:38 +01:00
Sergey M․
af93fcfa05 [beeg] Update API URL (Closes #7736) 2015-12-03 23:23:36 +06:00
Sergey M․
62d231c004 [extractor/common] Clarify duration can be float 2015-12-03 20:55:02 +06:00
Sergey M․
49358274d7 [bbc] Fix _VALID_URL 2015-12-03 20:49:14 +06:00
Jaime Marquínez Ferrándiz
7b1e379ca9 [gametrailers] Fix extraction (fixes #7722)
They have stopped using the MTV system.
2015-12-03 13:47:21 +01:00
Sergey M․
22d7368dfb [bbc] Extract _ID_REGEX and ad one more video id pattern (Closes #7724) 2015-12-02 02:34:31 +06:00
Sergey M․
24121bc703 [udemy] Make lecture downloading fatal 2015-12-02 00:53:03 +06:00
Sergey M․
9fc87fa767 [udemy] Remove unused import 2015-12-02 00:51:47 +06:00
Sergey M․
328f82d59a [udemy] Semi-switch to api 2.0 (Closes #7704)
* Use api 2.0 to get lectures since it provides more formats
* Fix authorization for api 2.0
* Autotry enrolling in the course for single lectures
* Extract additional metadata rom asset['data']['outputs']
2015-12-02 00:48:27 +06:00
Sergey M․
78717fc328 [udemy] Allow authentication via cookies 2015-12-01 22:10:10 +06:00
Sergey M․
3b35c3425e [udemy] Extract formats from data.outputs (#7704) 2015-12-01 20:35:46 +06:00
Sergey M․
874ae0354e [nrk] Extract f4m formats and impose geo restriction only when not media URL (Closes #7715) 2015-12-01 18:35:24 +06:00
Sergey M․
4c6b4764f0 [youtube] Clarify itag 272 possible resolutions (#7699) 2015-11-30 20:42:05 +06:00
Sergey M․
59ee8a8647 [facebook] Make alternative title optional (Closes #7700) 2015-11-30 20:10:09 +06:00
Sergey M․
af284305d5 [vodlocker] Capture file not found error (Closes #7696) 2015-11-30 03:58:39 +06:00
Sergey M․
d53a4af1a4 [pornhub:playlist] Allow alphanumeric viewkeys (Closes #7695) 2015-11-30 03:47:01 +06:00
Sergey M․
2e1b928540 [youtube:playlist] Extend _VALID_URL 2015-11-29 21:04:11 +06:00
Sergey M․
040ac68679 [youtube] Extend _VALID_URL (Closes #7694) 2015-11-29 21:01:59 +06:00
Yen Chi Hsuan
049d71d874 [youtube] Simplify and make sure header values are strings 2015-11-29 19:52:48 +08:00
Sergey M․
bf2c8c8f82 [spiegel] Fix extraction (Closes #7693) 2015-11-29 17:03:33 +06:00
Yen Chi Hsuan
ef428960c9 Merge pull request #7691 from ryandesign/use-PYTHON-env-var
Always use PYTHON env var in Makefile
2015-11-29 13:08:46 +08:00
Yen Chi Hsuan
992fc9d6e1 [utils] Refactor handle_youtubedl_headers for future extension 2015-11-29 12:58:29 +08:00
Ryan Schmidt
8639f89f51 Always use PYTHON env var in Makefile 2015-11-28 22:56:24 -06:00
Yen Chi Hsuan
0424ec307b [utils] Correct docstring of YoutubeDLHandler 2015-11-29 12:46:04 +08:00
Yen Chi Hsuan
ac5a69af45 [youtube] Disable compression for live streams 2015-11-29 12:44:24 +08:00
Yen Chi Hsuan
94e8c80473 [downloader/hls] Respect Youtubedl-* headers 2015-11-29 12:43:59 +08:00
Yen Chi Hsuan
87f0e62d94 [utils] Separate codes for handling Youtubedl-* headers 2015-11-29 12:42:50 +08:00
remitamine
46b4070f3f Merge pull request #7057 from remitamine/cspan
[cspan] correct the clip info extraction (fixes #7335)
2015-11-28 21:36:52 +01:00
remitamine
2a776f9788 [cspan] change into a function 2015-11-28 20:22:31 +01:00
remitamine
f4c7ef9862 [skynewsarabia] return empty categories array if there is no topic 2015-11-28 18:20:44 +01:00
remitamine
50e12e9df1 [acast] Add new extractor 2015-11-28 18:10:37 +01:00
Sergey M․
b7faebbac8 [bloomberg] Improve formats extraction 2015-11-28 22:45:19 +06:00
Sergey M․
4191fdf147 [bloomberg] Improve video id regex 2015-11-28 22:41:39 +06:00
Sergey M․
9a4f12be98 [bloomberg] Modernize 2015-11-28 22:40:29 +06:00
Sergey M․
7ad4258add [bloomberg] Relax _VALID_URL even more (Closes #7685) 2015-11-28 22:39:36 +06:00
Sergey M․
9945c4994c Credit @reiv for soundcloud:search 2015-11-28 20:21:03 +06:00
Sergey M․
5faf9fed7e [youtube] Clarify rationale for yt:stretch validation 2015-11-28 18:50:21 +06:00
Sergey M
13a9b69b09 Merge pull request #7677 from lalinsky/yt-stretch-zero-height
[youtube] Ignore yt:stretch with zero width/height
2015-11-28 18:14:06 +06:00
remitamine
4975650e00 [skynewsarabia] fix IE_NAME 2015-11-28 12:20:39 +01:00
remitamine
0cc7178546 [skynewsarabia] Add new extractor 2015-11-28 11:48:18 +01:00
Lukáš Lalinský
41f24c321d [youtube] Use the existing w and h variables 2015-11-28 08:16:46 +01:00
Yen Chi Hsuan
4b3fbafdd2 [options] Changed wording for --list-formats
As proposed by @dstftw at 9bff48a0e7
2015-11-28 14:14:20 +08:00
Sergey M․
7ac40086f5 [dbtv] Expand _VALID_URL (Closes #7645) 2015-11-28 08:44:13 +06:00
Lukáš Lalinský
313dfc45f5 [youtube] Ignore yt:stretch with zero width/height 2015-11-28 01:07:07 +01:00
Philipp Hagemeister
78a55d7a28 release 2015.11.27.1 2015-11-27 16:39:59 +01:00
Philipp Hagemeister
bb6ac83698 release 2015.11.27 2015-11-27 16:32:51 +01:00
Yen Chi Hsuan
9d0e366880 [downloader/hls] Remove Accept-encoding from headers passed to ffmpeg
Fails for Youtube Gaming live streams (#7671)
2015-11-27 21:37:45 +08:00
Yen Chi Hsuan
9bff48a0e7 [options] Clarify --list-formats needs videos (closes #7669) 2015-11-27 21:24:39 +08:00
remitamine
60121eb514 [gameinformer] Add new extractor 2015-11-26 22:43:31 +01:00
remitamine
527ca1da4f [audimedia] Add new extractor(closes #7654) 2015-11-26 21:24:10 +01:00
Sergey M
7689413e42 [README.md] Mention mplayer and mpv in "other programs" question 2015-11-24 23:06:21 +06:00
Philipp Hagemeister
ba7a92b0ce release 2015.11.24 2015-11-24 07:46:38 +01:00
Philipp Hagemeister
4c7d816dd7 [jsinterp] Adapt to updated YouTube code generation (Fixes #7623, fixes #7624, fixes #7625, fixes #7626) 2015-11-24 07:45:38 +01:00
Philipp Hagemeister
032f2f260f README: Document which other programs may be helpful (Fixes #7621) 2015-11-24 03:38:46 +01:00
remitamine
240384afe6 [clipfish] improve info extraction 2015-10-30 20:06:38 +01:00
remitamine
9a605c8859 [adobetv] add support for show and channel extraction 2015-10-29 20:00:27 +01:00
remitamine
402ca40c9d [adobetv] extract AdobeTVVideo info from json directly 2015-10-29 19:55:04 +01:00
remitamine
30bd1c16c8 [adobetv] use api for extraction and add support specific language videos 2015-10-29 19:44:26 +01:00
remitamine
497f5fd93f [bilibili] extract multiple backup_urls 2015-10-21 08:24:05 +01:00
remitamine
4bf5614195 [cspan] move get_text_attr to CSpanIE 2015-10-20 07:43:39 +01:00
remitamine
520e753390 [bilibili] add support for specefic page extraction 2015-10-17 23:12:58 +01:00
remitamine
355c7ad361 [cspan] handle error massages and extract qualities 2015-10-17 21:30:38 +01:00
remitamine
55af2b26e0 [bilibili] extract backup url 2015-10-17 18:30:51 +01:00
remitamine
d90e40305b [bilibili] fix info extraction 2015-10-17 17:28:09 +01:00
remitamine
cce9d15d01 [ooyala] extract domain,handle errors and change related tests 2015-10-16 16:02:40 +01:00
remitamine
dd414c970b [ooyala] fix sorting and format id 2015-10-16 10:12:42 +01:00
remitamine
497ca088a6 [ooyala] remove print statment 2015-10-15 14:37:05 +01:00
remitamine
90bddb6cdd [ooyala] extract more formats and metadata 2015-10-15 14:28:56 +01:00
remitamine
6a11bb77ba [nba] add support for team subsites 2015-10-07 12:17:32 +01:00
remitamine
ecf6de5b02 [nba] extract width,height and bitrate from format key 2015-10-07 07:09:45 +01:00
remitamine
139f27827e [nba] skip Legacy Video Files 2015-10-07 06:53:19 +01:00
remitamine
30787f7259 [cspan] correct the clip info extraction 2015-10-03 19:28:48 +01:00
remitamine
c233e6bcc3 [nba] extract video info from xml feed 2015-10-03 12:30:05 +01:00
remitamine
28809ab07a [nba] extract more formats 2015-10-03 09:47:19 +01:00
remitamine
8fc226ef99 [nba] extract all video formats and extract more info 2015-10-02 17:24:30 +01:00
73 changed files with 2035 additions and 835 deletions
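
Commit 374c761e77 in the list above describes an `AttributeError` raised when no `downloader` is passed to `FFmpegPostProcessor`, and a fix that defaults to `False`. The following is a minimal sketch of that kind of guard, not the actual youtube-dl code: the classes `FakeDownloader` and `PostProcessorSketch` are hypothetical stand-ins, and the parameter key `'prefer_ffmpeg'` is an assumption based on the commit message.

```python
class FakeDownloader(object):
    """Hypothetical stand-in for a downloader object carrying a params dict."""

    def __init__(self, params):
        self.params = params


class PostProcessorSketch(object):
    """Hypothetical simplification; not the real FFmpegPostProcessor."""

    def __init__(self, downloader=None):
        self._downloader = downloader

    def prefer_ffmpeg(self):
        # Without this guard, self._downloader.params fails with
        # AttributeError: 'NoneType' object has no attribute 'params'
        if self._downloader is None:
            return False
        # 'prefer_ffmpeg' is an assumed key name; default to False if absent.
        return self._downloader.params.get('prefer_ffmpeg', False)
```

With this shape, `PostProcessorSketch().prefer_ffmpeg()` falls back to `False` instead of raising, while a downloader that sets the parameter is still honoured.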

View File

@@ -146,3 +146,6 @@ Lukáš Lalinský
Qijiang Fan
Rémy Léone
Marco Ferragina
reiv
Muratcan Simsek
Evan Lu

View File

@@ -1,4 +1,18 @@
**Please include the full output of youtube-dl when run with `-v`**.
**Please include the full output of youtube-dl when run with `-v`**, i.e. add `-v` flag to your command line, copy the **whole** output and post it in the issue body wrapped in \`\`\` for better formatting. It should look similar to this:
```
$ youtube-dl -v http://www.youtube.com/watch?v=BaW_jenozKcj
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2015.12.06
[debug] Git HEAD: 135392e
[debug] Python version 2.6.6 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
```
**Do not post screenshots of verbose log only plain text is acceptable.**
The output (including the first lines) contains important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
@@ -20,7 +34,7 @@ For bug reports, this means that your report should contain the *complete* outpu
If your server has multiple IPs or you suspect censorship, adding `--call-home` may be a good idea to get more diagnostics. If the error is `ERROR: Unable to extract ...` and you cannot reproduce it from multiple countries, add `--dump-pages` (warning: this will yield a rather large output, redirect it to the file `log.txt` by adding `>log.txt 2>&1` to your command-line) or upload the `.dump` files you get when you add `--write-pages` [somewhere](https://gist.github.com/).
**Site support requests must contain an example URL**. An example URL is a URL you might want to download, like http://www.youtube.com/watch?v=BaW_jenozKc . There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. http://www.youtube.com/ ) is *not* an example URL.
**Site support requests must contain an example URL**. An example URL is a URL you might want to download, like `http://www.youtube.com/watch?v=BaW_jenozKc`. There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. `http://www.youtube.com/`) is *not* an example URL.
### Are you using the latest version?
@@ -28,7 +42,7 @@ Before reporting any issue, type `youtube-dl -U`. This should report that you're
### Is the issue already documented?
Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or at https://github.com/rg3/youtube-dl/search?type=Issues . If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or browse the [GitHub Issues](https://github.com/rg3/youtube-dl/search?type=Issues) of this repository. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
### Why are existing options not enough?

View File

@@ -61,34 +61,34 @@ youtube-dl: youtube_dl/*.py youtube_dl/*/*.py
chmod a+x youtube-dl
README.md: youtube_dl/*.py youtube_dl/*/*.py
COLUMNS=80 python youtube_dl/__main__.py --help | python devscripts/make_readme.py
COLUMNS=80 $(PYTHON) youtube_dl/__main__.py --help | $(PYTHON) devscripts/make_readme.py
CONTRIBUTING.md: README.md
python devscripts/make_contributing.py README.md CONTRIBUTING.md
$(PYTHON) devscripts/make_contributing.py README.md CONTRIBUTING.md
supportedsites:
python devscripts/make_supportedsites.py docs/supportedsites.md
$(PYTHON) devscripts/make_supportedsites.py docs/supportedsites.md
README.txt: README.md
pandoc -f markdown -t plain README.md -o README.txt
youtube-dl.1: README.md
python devscripts/prepare_manpage.py >youtube-dl.1.temp.md
$(PYTHON) devscripts/prepare_manpage.py >youtube-dl.1.temp.md
pandoc -s -f markdown -t man youtube-dl.1.temp.md -o youtube-dl.1
rm -f youtube-dl.1.temp.md
youtube-dl.bash-completion: youtube_dl/*.py youtube_dl/*/*.py devscripts/bash-completion.in
python devscripts/bash-completion.py
$(PYTHON) devscripts/bash-completion.py
bash-completion: youtube-dl.bash-completion
youtube-dl.zsh: youtube_dl/*.py youtube_dl/*/*.py devscripts/zsh-completion.in
python devscripts/zsh-completion.py
$(PYTHON) devscripts/zsh-completion.py
zsh-completion: youtube-dl.zsh
youtube-dl.fish: youtube_dl/*.py youtube_dl/*/*.py devscripts/fish-completion.in
python devscripts/fish-completion.py
$(PYTHON) devscripts/fish-completion.py
fish-completion: youtube-dl.fish

View File

@@ -35,7 +35,7 @@ You can also use pip:
sudo pip install youtube-dl
Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see https://rg3.github.io/youtube-dl/download.html .
Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://rg3.github.io/youtube-dl/download.html).
# DESCRIPTION
**youtube-dl** is a small command-line program to download videos from
@@ -319,7 +319,8 @@ which means you can modify it, redistribute it or use it however you like.
--all-formats Download all available video formats
--prefer-free-formats Prefer free video formats unless a specific
one is requested
-F, --list-formats List all available formats
-F, --list-formats List all available formats of requested
videos
--youtube-skip-dash-manifest Do not download the DASH manifests and
related data on YouTube videos
--merge-output-format FORMAT If a merge is required (e.g.
@@ -413,7 +414,7 @@ You can configure youtube-dl by placing any supported command line option to a c
You can use `--ignore-config` if you want to disable the configuration file for a particular youtube-dl run.
### Authentication with `.netrc` file ###
### Authentication with `.netrc` file
You may also want to configure automatic credentials storage for extractors that support authentication (by providing login and password with `--username` and `--password`) in order not to pass credentials as command line arguments on every youtube-dl execution and prevent tracking plain text passwords in the shell command history. You can achieve this using a [`.netrc` file](http://stackoverflow.com/tags/.netrc/info) on per extractor basis. For that you will need to create a`.netrc` file in your `$HOME` and restrict permissions to read/write by you only:
```
@@ -534,6 +535,12 @@ Most people asking this question are not aware that youtube-dl now defaults to d
Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We're [considering to provide a way to let you solve the CAPTCHA](https://github.com/rg3/youtube-dl/issues/154), but at the moment, your best course of action is pointing a webbrowser to the youtube URL, solving the CAPTCHA, and restart youtube-dl.
### Do I need any other programs?
youtube-dl works fine on its own on most sites. However, if you want to convert video/audio, you'll need [avconv](https://libav.org/) or [ffmpeg](https://www.ffmpeg.org/). On some sites - most notably YouTube - videos can be retrieved in a higher quality format without sound. youtube-dl will detect whether avconv/ffmpeg is present and automatically pick the best option.
Videos or video formats streamed via RTMP protocol can only be downloaded when [rtmpdump](https://rtmpdump.mplayerhq.hu/) is installed. Downloading MMS and RTSP videos requires either [mplayer](http://mplayerhq.hu/) or [mpv](https://mpv.io/) to be installed.
### I have downloaded a video but how can I play it?
Once the video is fully downloaded, use any video player, such as [vlc](http://www.videolan.org) or [mplayer](http://www.mplayerhq.hu/).
@@ -552,11 +559,11 @@ If you want to play the video on a machine that is not running youtube-dl, you c
YouTube has switched to a new video info format in July 2011 which is not supported by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
### ERROR: unable to download video ###
### ERROR: unable to download video
YouTube requires an additional signature since September 2012 which is not supported by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
### Video URL contains an ampersand and I'm getting some strange output `[1] 2839` or `'v' is not recognized as an internal or external command` ###
### Video URL contains an ampersand and I'm getting some strange output `[1] 2839` or `'v' is not recognized as an internal or external command`
That's actually the output from your shell. Since ampersand is one of the special shell characters it's interpreted by the shell preventing you from passing the whole URL to youtube-dl. To disable your shell from interpreting the ampersands (or any other special characters) you have to either put the whole URL in quotes or escape them with a backslash (which approach will work depends on your shell).
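
The quoting described above can be illustrated from Python. This is a minimal sketch assuming a POSIX shell (Windows `cmd` has different quoting rules) and a made-up URL with a second query parameter; `shlex.quote` is part of the standard library from Python 3.3.

```python
import shlex

# Hypothetical URL: without quoting, a POSIX shell would treat the "&"
# as "run the preceding command in the background".
url = "http://www.youtube.com/watch?v=BaW_jenozKc&t=10"

# shlex.quote wraps the string in single quotes when it contains
# characters that are special to a POSIX shell.
quoted = shlex.quote(url)
command_line = "youtube-dl " + quoted
print(command_line)
```

Splitting `command_line` back with `shlex.split` yields the program name and the whole URL as a single argument, which is what youtube-dl needs to receive.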
@@ -580,7 +587,7 @@ In February 2015, the new YouTube player contained a character sequence in a str
These two error codes indicate that the service is blocking your IP address because of overuse. Contact the service and ask them to unblock your IP address, or - if you have acquired a whitelisted IP address already - use the [`--proxy` or `--source-address` options](#network-options) to select another IP address.
### SyntaxError: Non-ASCII character ###
### SyntaxError: Non-ASCII character
The error
@@ -609,7 +616,7 @@ From then on, after restarting your shell, you will be able to access both youtu
Use the `-o` to specify an [output template](#output-template), for example `-o "/home/user/videos/%(title)s-%(id)s.%(ext)s"`. If you want this for all of your downloads, put the option into your [configuration file](#configuration).
### How do I download a video starting with a `-` ?
### How do I download a video starting with a `-`?
Either prepend `http://www.youtube.com/watch?v=` or separate the ID from the options with `--`:
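
The effect of `--` can be sketched with the standard-library `optparse` module. The parser below is a hypothetical one-flag example, not youtube-dl's real option set, and the video ID `-wNyEUrxzFU` is only an illustrative value.

```python
import optparse

# Hypothetical parser with a single flag; the real option set is much larger.
parser = optparse.OptionParser()
parser.add_option("-v", "--verbose", action="store_true", default=False)

# "--" ends option parsing, so the ID that starts with "-" is kept
# as a positional argument instead of being read as an option.
opts, args = parser.parse_args(["-v", "--", "-wNyEUrxzFU"])
print(opts.verbose, args)
```

Without the `--`, a parser of this kind would try to interpret `-wNyEUrxzFU` as a cluster of short options and report an error.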
@@ -791,9 +798,23 @@ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
# BUGS
Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues> . Unless you were prompted so or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email. For discussions, join us in the irc channel #youtube-dl on freenode.
Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues>. Unless you were prompted to do so or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email. For discussions, join us in the IRC channel [#youtube-dl](irc://chat.freenode.net/#youtube-dl) on freenode ([webchat](http://webchat.freenode.net/?randomnick=1&channels=youtube-dl)).
**Please include the full output of youtube-dl when run with `-v`**.
**Please include the full output of youtube-dl when run with `-v`**, i.e. add the `-v` flag to your command line, copy the **whole** output, and post it in the issue body wrapped in \`\`\` for better formatting. It should look similar to this:
```
$ youtube-dl -v http://www.youtube.com/watch?v=BaW_jenozKcj
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2015.12.06
[debug] Git HEAD: 135392e
[debug] Python version 2.6.6 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
```
**Do not post screenshots of verbose logs; only plain text is acceptable.**
The output (including the first lines) contains important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
@@ -815,7 +836,7 @@ For bug reports, this means that your report should contain the *complete* outpu
If your server has multiple IPs or you suspect censorship, adding `--call-home` may be a good idea to get more diagnostics. If the error is `ERROR: Unable to extract ...` and you cannot reproduce it from multiple countries, add `--dump-pages` (warning: this will yield a rather large output, redirect it to the file `log.txt` by adding `>log.txt 2>&1` to your command-line) or upload the `.dump` files you get when you add `--write-pages` [somewhere](https://gist.github.com/).
**Site support requests must contain an example URL**. An example URL is a URL you might want to download, like http://www.youtube.com/watch?v=BaW_jenozKc . There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. http://www.youtube.com/ ) is *not* an example URL.
**Site support requests must contain an example URL**. An example URL is a URL you might want to download, like `http://www.youtube.com/watch?v=BaW_jenozKc`. There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. `http://www.youtube.com/`) is *not* an example URL.
### Are you using the latest version?
@@ -823,7 +844,7 @@ Before reporting any issue, type `youtube-dl -U`. This should report that you're
### Is the issue already documented?
Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or at https://github.com/rg3/youtube-dl/search?type=Issues . If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or browse the [GitHub Issues](https://github.com/rg3/youtube-dl/search?type=Issues) of this repository. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
### Why are existing options not enough?
@@ -853,4 +874,4 @@ It may sound strange, but some bug reports we receive are completely unrelated t
youtube-dl is released into the public domain by the copyright holders.
This README file was originally written by Daniel Bolton (<https://github.com/dbbolton>) and is likewise released into the public domain.
This README file was originally written by [Daniel Bolton](https://github.com/dbbolton) and is likewise released into the public domain.


@@ -15,8 +15,12 @@
- **abc.net.au**
- **Abc7News**
- **AcademicEarth:Course**
- **acast**
- **acast:channel**
- **AddAnime**
- **AdobeTV**
- **AdobeTVChannel**
- **AdobeTVShow**
- **AdobeTVVideo**
- **AdultSwim**
- **Aftenposten**
@@ -43,6 +47,7 @@
- **arte.tv:future**
- **AtresPlayer**
- **ATTTechChannel**
- **AudiMedia**
- **audiomack**
- **audiomack:album**
- **Azubu**
@@ -92,6 +97,7 @@
- **Clipfish**
- **cliphunter**
- **Clipsyndicate**
- **cloudtime**: CloudTime
- **Cloudy**
- **Clubic**
- **Clyp**
@@ -182,7 +188,9 @@
- **Freesound**
- **freespeech.org**
- **FreeVideo**
- **Funimation**
- **FunnyOrDie**
- **GameInformer**
- **Gamekings**
- **GameOne**
- **gameone:playlist**
@@ -307,7 +315,6 @@
- **MovieClips**
- **MovieFap**
- **Moviezine**
- **movshare**: MovShare
- **MPORA**
- **MSNBC**
- **MTV**
@@ -394,7 +401,7 @@
- **orf:tvthek**: ORF TVthek
- **parliamentlive.tv**: UK parliament videos
- **Patreon**
- **PBS**
- **pbs**: Public Broadcasting Service (PBS) and member stations: PBS: Public Broadcasting Service, APT - Alabama Public Television (WBIQ), GPB/Georgia Public Broadcasting (WGTV), Mississippi Public Broadcasting (WMPN), Nashville Public Television (WNPT), WFSU-TV (WFSU), WSRE (WSRE), WTCI (WTCI), WPBA/Channel 30 (WPBA), Alaska Public Media (KAKM), Arizona PBS (KAET), KNME-TV/Channel 5 (KNME), Vegas PBS (KLVX), AETN/ARKANSAS ETV NETWORK (KETS), KET (WKLE), WKNO/Channel 10 (WKNO), LPB/LOUISIANA PUBLIC BROADCASTING (WLPB), OETA (KETA), Ozarks Public Television (KOZK), WSIU Public Broadcasting (WSIU), KEET TV (KEET), KIXE/Channel 9 (KIXE), KPBS San Diego (KPBS), KQED (KQED), KVIE Public Television (KVIE), PBS SoCal/KOCE (KOCE), ValleyPBS (KVPT), CONNECTICUT PUBLIC TELEVISION (WEDH), KNPB Channel 5 (KNPB), SOPTV (KSYS), Rocky Mountain PBS (KRMA), KENW-TV3 (KENW), KUED Channel 7 (KUED), Wyoming PBS (KCWC), Colorado Public Television / KBDI 12 (KBDI), KBYU-TV (KBYU), Thirteen/WNET New York (WNET), WGBH/Channel 2 (WGBH), WGBY (WGBY), NJTV Public Media NJ (WNJT), WLIW21 (WLIW), mpt/Maryland Public Television (WMPB), WETA Television and Radio (WETA), WHYY (WHYY), PBS 39 (WLVT), WVPT - Your Source for PBS and More! 
(WVPT), Howard University Television (WHUT), WEDU PBS (WEDU), WGCU Public Media (WGCU), WPBT2 (WPBT), WUCF TV (WUCF), WUFT/Channel 5 (WUFT), WXEL/Channel 42 (WXEL), WLRN/Channel 17 (WLRN), WUSF Public Broadcasting (WUSF), ETV (WRLK), UNC-TV (WUNC), PBS Hawaii - Oceanic Cable Channel 10 (KHET), Idaho Public Television (KAID), KSPS (KSPS), OPB (KOPB), KWSU/Channel 10 & KTNW/Channel 31 (KWSU), WILL-TV (WILL), Network Knowledge - WSEC/Springfield (WSEC), WTTW11 (WTTW), Iowa Public Television/IPTV (KDIN), Nine Network (KETC), PBS39 Fort Wayne (WFWA), WFYI Indianapolis (WFYI), Milwaukee Public Television (WMVS), WNIN (WNIN), WNIT Public Television (WNIT), WPT (WPNE), WVUT/Channel 22 (WVUT), WEIU/Channel 51 (WEIU), WQPT-TV (WQPT), WYCC PBS Chicago (WYCC), WIPB-TV (WIPB), WTIU (WTIU), CET (WCET), ThinkTVNetwork (WPTD), WBGU-TV (WBGU), WGVU TV (WGVU), NET1 (KUON), Pioneer Public Television (KWCM), SDPB Television (KUSD), TPT (KTCA), KSMQ (KSMQ), KPTS/Channel 8 (KPTS), KTWU/Channel 11 (KTWU), East Tennessee PBS (WSJK), WCTE-TV (WCTE), WLJT, Channel 11 (WLJT), WOSU TV (WOSU), WOUB/WOUC (WOUB), WVPB (WVPB), WKYU-PBS (WKYU), KERA 13 (KERA), MPBN (WCBB), Mountain Lake PBS (WCFE), NHPTV (WENH), Vermont PBS (WETK), witf (WITF), WQED Multimedia (WQED), WMHT Educational Telecommunications (WMHT), Q-TV (WDCQ), WTVS Detroit Public TV (WTVS), CMU Public Television (WCMU), WKAR-TV (WKAR), WNMU-TV Public TV 13 (WNMU), WDSE - WRPT (WDSE), WGTE TV (WGTE), Lakeland Public Television (KAWE), KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS), MontanaPBS (KUSM), KRWG/Channel 22 (KRWG), KACV (KACV), KCOS/Channel 13 (KCOS), WCNY/Channel 24 (WCNY), WNED (WNED), WPBS (WPBS), WSKG Public TV (WSKG), WXXI (WXXI), WPSU (WPSU), WVIA Public Media Studios (WVIA), WTVI (WTVI), Western Reserve PBS (WNEO), WVIZ/PBS ideastream (WVIZ), KCTS 9 (KCTS), Basin PBS (KPBT), KUHT / Channel 8 (KUHT), KLRN (KLRN), KLRU (KLRU), WTJX Channel 12 (WTJX), WCVE PBS (WCVE), KBTC Public Television (KBTC)
- **Periscope**: Periscope
- **PhilharmonieDeParis**: Philharmonie de Paris
- **Phoenix**
@@ -480,6 +487,8 @@
- **Shared**: shared.sx and vivo.sx
- **ShareSix**
- **Sina**
- **skynewsarabia:video**
- **skynewsarabia:video**
- **Slideshare**
- **Slutload**
- **smotri**: Smotri.com
@@ -665,6 +674,7 @@
- **WebOfStories**
- **WebOfStoriesPlaylist**
- **Weibo**
- **wholecloud**: WholeCloud
- **Wimp**
- **Wistia**
- **WNL**


@@ -121,8 +121,8 @@ class TestAllURLsMatching(unittest.TestCase):
def test_pbs(self):
# https://github.com/rg3/youtube-dl/issues/2350
self.assertMatch('http://video.pbs.org/viralplayer/2365173446/', ['PBS'])
self.assertMatch('http://video.pbs.org/widget/partnerplayer/980042464/', ['PBS'])
self.assertMatch('http://video.pbs.org/viralplayer/2365173446/', ['pbs'])
self.assertMatch('http://video.pbs.org/widget/partnerplayer/980042464/', ['pbs'])
def test_yahoo_https(self):
# https://github.com/rg3/youtube-dl/issues/2701


@@ -1110,6 +1110,12 @@ class YoutubeDL(object):
'contain the video, try using '
'"-f %s+%s"' % (format_2, format_1))
return
# Formats must be opposite (video+audio)
if formats_info[0].get('acodec') == 'none' and formats_info[1].get('acodec') == 'none':
self.report_error(
'Both formats %s and %s are video-only, you must specify "-f video+audio"'
% (format_1, format_2))
return
output_ext = (
formats_info[0]['ext']
if self.params.get('merge_output_format') is None
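The new guard in this hunk rejects a merge request in which neither format carries audio. A self-contained sketch of the same check, assuming each format is a plain dictionary with `format_id` and `acodec` keys; the function name `merge_problem` is hypothetical and not part of `YoutubeDL`:

```python
def merge_problem(formats_info):
    """Return an error message if both requested formats are video-only, else None."""
    first, second = formats_info
    # An 'acodec' of 'none' marks a format without an audio stream.
    if first.get('acodec') == 'none' and second.get('acodec') == 'none':
        return 'Both formats %s and %s are video-only' % (
            first['format_id'], second['format_id'])
    return None


# Two video-only formats: the merge is refused.
print(merge_problem([
    {'format_id': '137', 'acodec': 'none'},
    {'format_id': '136', 'acodec': 'none'},
]))
```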


@@ -15,6 +15,7 @@ from ..compat import (
)
from ..utils import (
encodeFilename,
fix_xml_ampersands,
sanitize_open,
struct_pack,
struct_unpack,
@@ -288,7 +289,10 @@ class F4mFD(FragmentFD):
self.to_screen('[%s] Downloading f4m manifest' % self.FD_NAME)
urlh = self.ydl.urlopen(man_url)
man_url = urlh.geturl()
manifest = urlh.read()
# Some manifests may be malformed, e.g. prosiebensat1 generated manifests
# (see https://github.com/rg3/youtube-dl/issues/6215#issuecomment-121704244
# and https://github.com/rg3/youtube-dl/issues/7823)
manifest = fix_xml_ampersands(urlh.read().decode('utf-8', 'ignore')).strip()
doc = compat_etree_fromstring(manifest)
formats = [(int(f.attrib.get('bitrate', -1)), f)
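The change above decodes the manifest, escapes stray ampersands and strips surrounding whitespace before parsing. The sketch below illustrates why that matters. `escape_bare_ampersands` is a hypothetical stand-in written for this note; it approximates the idea and is not the actual `fix_xml_ampersands` from `utils`. The manifest bytes are invented:

```python
import re
import xml.etree.ElementTree as ET


def escape_bare_ampersands(xml_str):
    # Replace every "&" that does not already start an entity or a
    # numeric character reference with "&amp;".
    return re.sub(
        r'&(?!amp;|lt;|gt;|apos;|quot;|#x[0-9a-fA-F]+;|#[0-9]+;)',
        '&amp;', xml_str)


# Invented malformed manifest: the "&" inside the attribute is not escaped.
raw = b'  <manifest><media url="seg.f4m?a=1&b=2"/></manifest>\n'

manifest = escape_bare_ampersands(raw.decode('utf-8', 'ignore')).strip()
doc = ET.fromstring(manifest)
print(doc.find('media').get('url'))
```

Without the escaping step, `ET.fromstring` raises a parse error on the bare `&`.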


@@ -13,6 +13,7 @@ from ..utils import (
encodeArgument,
encodeFilename,
sanitize_open,
handle_youtubedl_headers,
)
@@ -33,9 +34,10 @@ class HlsFD(FileDownloader):
if info_dict['http_headers'] and re.match(r'^https?://', url):
# Trailing \r\n after each HTTP header is important to prevent warning from ffmpeg/avconv:
# [http @ 00000000003d2fa0] No trailing CRLF found in HTTP header.
headers = handle_youtubedl_headers(info_dict['http_headers'])
args += [
'-headers',
''.join('%s: %s\r\n' % (key, val) for key, val in info_dict['http_headers'].items())]
''.join('%s: %s\r\n' % (key, val) for key, val in headers.items())]
args += ['-i', url, '-f', 'mp4', '-c', 'copy', '-bsf:a', 'aac_adtstoasc']
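The `-headers` value passed to ffmpeg/avconv is one string in which every header ends with `\r\n`. A small sketch of that joining step in isolation; the helper name `build_headers_arg` and the header values are illustrative assumptions:

```python
def build_headers_arg(http_headers):
    # Each header is terminated with CRLF, including the last one,
    # which is what avoids the "No trailing CRLF" warning quoted above.
    return ''.join('%s: %s\r\n' % (key, val) for key, val in http_headers.items())


headers = {'User-Agent': 'Mozilla/5.0', 'Referer': 'http://example.com/'}
args = ['-headers', build_headers_arg(headers)]
print(repr(args[1]))
```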


@@ -3,9 +3,15 @@ from __future__ import unicode_literals
from .abc import ABCIE
from .abc7news import Abc7NewsIE
from .academicearth import AcademicEarthCourseIE
from .acast import (
ACastIE,
ACastChannelIE,
)
from .addanime import AddAnimeIE
from .adobetv import (
AdobeTVIE,
AdobeTVShowIE,
AdobeTVChannelIE,
AdobeTVVideoIE,
)
from .adultswim import AdultSwimIE
@@ -38,6 +44,7 @@ from .arte import (
)
from .atresplayer import AtresPlayerIE
from .atttechchannel import ATTTechChannelIE
from .audimedia import AudiMediaIE
from .audiomack import AudiomackIE, AudiomackAlbumIE
from .azubu import AzubuIE
from .baidu import BaiduVideoIE
@@ -199,7 +206,9 @@ from .francetv import (
from .freesound import FreesoundIE
from .freespeech import FreespeechIE
from .freevideo import FreeVideoIE
from .funimation import FunimationIE
from .funnyordie import FunnyOrDieIE
from .gameinformer import GameInformerIE
from .gamekings import GamekingsIE
from .gameone import (
GameOneIE,
@@ -349,7 +358,6 @@ from .motherless import MotherlessIE
from .motorsport import MotorsportIE
from .movieclips import MovieClipsIE
from .moviezine import MoviezineIE
from .movshare import MovShareIE
from .mtv import (
MTVIE,
MTVServicesEmbeddedIE,
@@ -415,7 +423,13 @@ from .noco import NocoIE
from .normalboots import NormalbootsIE
from .nosvideo import NosVideoIE
from .nova import NovaIE
from .novamov import NovaMovIE
from .novamov import (
NovaMovIE,
WholeCloudIE,
NowVideoIE,
VideoWeedIE,
CloudTimeIE,
)
from .nowness import (
NownessIE,
NownessPlaylistIE,
@@ -425,7 +439,6 @@ from .nowtv import (
NowTVIE,
NowTVListIE,
)
from .nowvideo import NowVideoIE
from .npo import (
NPOIE,
NPOLiveIE,
@@ -554,6 +567,10 @@ from .shahid import ShahidIE
from .shared import SharedIE
from .sharesix import ShareSixIE
from .sina import SinaIE
from .skynewsarabia import (
SkyNewsArabiaIE,
SkyNewsArabiaArticleIE,
)
from .slideshare import SlideshareIE
from .slutload import SlutloadIE
from .smotri import (
@@ -732,7 +749,6 @@ from .videofyme import VideofyMeIE
from .videomega import VideoMegaIE
from .videopremium import VideoPremiumIE
from .videott import VideoTtIE
from .videoweed import VideoWeedIE
from .vidme import VidmeIE
from .vidzi import VidziIE
from .vier import VierIE, VierVideosIE


@@ -0,0 +1,70 @@
# coding: utf-8
from __future__ import unicode_literals

import re

from .common import InfoExtractor
from ..compat import compat_str
from ..utils import int_or_none


class ACastBaseIE(InfoExtractor):
    _API_BASE_URL = 'https://www.acast.com/api/'


class ACastIE(ACastBaseIE):
    IE_NAME = 'acast'
    _VALID_URL = r'https?://(?:www\.)?acast\.com/(?P<channel>[^/]+)/(?P<id>[^/#?]+)'
    _TEST = {
        'url': 'https://www.acast.com/condenasttraveler/-where-are-you-taipei-101-taiwan',
        'md5': 'ada3de5a1e3a2a381327d749854788bb',
        'info_dict': {
            'id': '57de3baa-4bb0-487e-9418-2692c1277a34',
            'ext': 'mp3',
            'title': '"Where Are You?": Taipei 101, Taiwan',
            'timestamp': 1196172000000,
            'description': 'md5:0c5d8201dfea2b93218ea986c91eee6e',
            'duration': 211,
        }
    }

    def _real_extract(self, url):
        channel, display_id = re.match(self._VALID_URL, url).groups()
        cast_data = self._download_json(self._API_BASE_URL + 'channels/%s/acasts/%s/playback' % (channel, display_id), display_id)
        return {
            'id': compat_str(cast_data['id']),
            'display_id': display_id,
            'url': cast_data['blings'][0]['audio'],
            'title': cast_data['name'],
            'description': cast_data.get('description'),
            'thumbnail': cast_data.get('image'),
            'timestamp': int_or_none(cast_data.get('publishingDate')),
            'duration': int_or_none(cast_data.get('duration')),
        }


class ACastChannelIE(ACastBaseIE):
    IE_NAME = 'acast:channel'
    _VALID_URL = r'https?://(?:www\.)?acast\.com/(?P<id>[^/#?]+)'
    _TEST = {
        'url': 'https://www.acast.com/condenasttraveler',
        'info_dict': {
            'id': '50544219-29bb-499e-a083-6087f4cb7797',
            'title': 'Condé Nast Traveler Podcast',
            'description': 'md5:98646dee22a5b386626ae31866638fbd',
        },
        'playlist_mincount': 20,
    }

    @classmethod
    def suitable(cls, url):
        return False if ACastIE.suitable(url) else super(ACastChannelIE, cls).suitable(url)

    def _real_extract(self, url):
        display_id = self._match_id(url)
        channel_data = self._download_json(self._API_BASE_URL + 'channels/%s' % display_id, display_id)
        casts = self._download_json(self._API_BASE_URL + 'channels/%s/acasts' % display_id, display_id)
        entries = [self.url_result('https://www.acast.com/%s/%s' % (display_id, cast['url']), 'ACast') for cast in casts]
        return self.playlist_result(entries, compat_str(channel_data['id']), channel_data['name'], channel_data.get('description'))
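The `suitable` override in `ACastChannelIE` exists because the channel pattern also matches the start of an episode URL. The standalone sketch below reuses the two regular expressions from this file to show the overlap and how deferring to the episode pattern resolves it; the helper names `is_episode` and `is_channel` are hypothetical:

```python
import re

# Patterns copied from ACastIE and ACastChannelIE in this file.
EPISODE_RE = r'https?://(?:www\.)?acast\.com/(?P<channel>[^/]+)/(?P<id>[^/#?]+)'
CHANNEL_RE = r'https?://(?:www\.)?acast\.com/(?P<id>[^/#?]+)'


def is_episode(url):
    return re.match(EPISODE_RE, url) is not None


def is_channel(url):
    # Defer to the episode pattern first, mirroring the suitable() override.
    return False if is_episode(url) else re.match(CHANNEL_RE, url) is not None


episode_url = 'https://www.acast.com/condenasttraveler/-where-are-you-taipei-101-taiwan'
channel_url = 'https://www.acast.com/condenasttraveler'

# On its own, the channel pattern also accepts the episode URL.
print(re.match(CHANNEL_RE, episode_url) is not None)
print(is_channel(episode_url), is_channel(channel_url))
```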


@@ -1,23 +1,32 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
parse_duration,
unified_strdate,
str_to_int,
int_or_none,
float_or_none,
ISO639Utils,
determine_ext,
)
class AdobeTVIE(InfoExtractor):
_VALID_URL = r'https?://tv\.adobe\.com/watch/[^/]+/(?P<id>[^/]+)'
class AdobeTVBaseIE(InfoExtractor):
_API_BASE_URL = 'http://tv.adobe.com/api/v4/'
class AdobeTVIE(AdobeTVBaseIE):
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?watch/(?P<show_urlname>[^/]+)/(?P<id>[^/]+)'
_TEST = {
'url': 'http://tv.adobe.com/watch/the-complete-picture-with-julieanne-kost/quick-tip-how-to-draw-a-circle-around-an-object-in-photoshop/',
'md5': '9bc5727bcdd55251f35ad311ca74fa1e',
'info_dict': {
'id': 'quick-tip-how-to-draw-a-circle-around-an-object-in-photoshop',
'id': '10981',
'ext': 'mp4',
'title': 'Quick Tip - How to Draw a Circle Around an Object in Photoshop',
'description': 'md5:99ec318dc909d7ba2a1f2b038f7d2311',
@@ -29,50 +38,106 @@ class AdobeTVIE(InfoExtractor):
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
language, show_urlname, urlname = re.match(self._VALID_URL, url).groups()
if not language:
language = 'en'
player = self._parse_json(
self._search_regex(r'html5player:\s*({.+?})\s*\n', webpage, 'player'),
video_id)
title = player.get('title') or self._search_regex(
r'data-title="([^"]+)"', webpage, 'title')
description = self._og_search_description(webpage)
thumbnail = self._og_search_thumbnail(webpage)
upload_date = unified_strdate(
self._html_search_meta('datepublished', webpage, 'upload date'))
duration = parse_duration(
self._html_search_meta('duration', webpage, 'duration') or
self._search_regex(
r'Runtime:\s*(\d{2}:\d{2}:\d{2})',
webpage, 'duration', fatal=False))
view_count = str_to_int(self._search_regex(
r'<div class="views">\s*Views?:\s*([\d,.]+)\s*</div>',
webpage, 'view count'))
video_data = self._download_json(
self._API_BASE_URL + 'episode/get/?language=%s&show_urlname=%s&urlname=%s&disclosure=standard' % (language, show_urlname, urlname),
urlname)['data'][0]
formats = [{
'url': source['src'],
'format_id': source.get('quality') or source['src'].split('-')[-1].split('.')[0] or None,
'tbr': source.get('bitrate'),
} for source in player['sources']]
'url': source['url'],
'format_id': source.get('quality_level') or source['url'].split('-')[-1].split('.')[0] or None,
'width': int_or_none(source.get('width')),
'height': int_or_none(source.get('height')),
'tbr': int_or_none(source.get('video_data_rate')),
} for source in video_data['videos']]
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'upload_date': upload_date,
'duration': duration,
'view_count': view_count,
'id': compat_str(video_data['id']),
'title': video_data['title'],
'description': video_data.get('description'),
'thumbnail': video_data.get('thumbnail'),
'upload_date': unified_strdate(video_data.get('start_date')),
'duration': parse_duration(video_data.get('duration')),
'view_count': str_to_int(video_data.get('playcount')),
'formats': formats,
}
class AdobeTVPlaylistBaseIE(AdobeTVBaseIE):
def _parse_page_data(self, page_data):
return [self.url_result(self._get_element_url(element_data)) for element_data in page_data]
def _extract_playlist_entries(self, url, display_id):
page = self._download_json(url, display_id)
entries = self._parse_page_data(page['data'])
for page_num in range(2, page['paging']['pages'] + 1):
entries.extend(self._parse_page_data(
self._download_json(url + '&page=%d' % page_num, display_id)['data']))
return entries
class AdobeTVShowIE(AdobeTVPlaylistBaseIE):
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?show/(?P<id>[^/]+)'
_TEST = {
'url': 'http://tv.adobe.com/show/the-complete-picture-with-julieanne-kost',
'info_dict': {
'id': '36',
'title': 'The Complete Picture with Julieanne Kost',
'description': 'md5:fa50867102dcd1aa0ddf2ab039311b27',
},
'playlist_mincount': 136,
}
def _get_element_url(self, element_data):
return element_data['urls'][0]
def _real_extract(self, url):
language, show_urlname = re.match(self._VALID_URL, url).groups()
if not language:
language = 'en'
query = 'language=%s&show_urlname=%s' % (language, show_urlname)
show_data = self._download_json(self._API_BASE_URL + 'show/get/?%s' % query, show_urlname)['data'][0]
return self.playlist_result(
self._extract_playlist_entries(self._API_BASE_URL + 'episode/?%s' % query, show_urlname),
compat_str(show_data['id']),
show_data['show_name'],
show_data['show_description'])
class AdobeTVChannelIE(AdobeTVPlaylistBaseIE):
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?channel/(?P<id>[^/]+)(?:/(?P<category_urlname>[^/]+))?'
_TEST = {
'url': 'http://tv.adobe.com/channel/development',
'info_dict': {
'id': 'development',
},
'playlist_mincount': 96,
}
def _get_element_url(self, element_data):
return element_data['url']
def _real_extract(self, url):
language, channel_urlname, category_urlname = re.match(self._VALID_URL, url).groups()
if not language:
language = 'en'
query = 'language=%s&channel_urlname=%s' % (language, channel_urlname)
if category_urlname:
query += '&category_urlname=%s' % category_urlname
return self.playlist_result(
self._extract_playlist_entries(self._API_BASE_URL + 'show/?%s' % query, channel_urlname),
channel_urlname)
class AdobeTVVideoIE(InfoExtractor):
_VALID_URL = r'https?://video\.tv\.adobe\.com/v/(?P<id>\d+)'
@@ -91,28 +156,25 @@ class AdobeTVVideoIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
player_params = self._parse_json(self._search_regex(
r'var\s+bridge\s*=\s*([^;]+);', webpage, 'player parameters'),
video_id)
video_data = self._download_json(url + '?format=json', video_id)
formats = [{
'format_id': '%s-%s' % (determine_ext(source['src']), source.get('height')),
'url': source['src'],
'width': source.get('width'),
'height': source.get('height'),
'tbr': source.get('bitrate'),
} for source in player_params['sources']]
'width': int_or_none(source.get('width')),
'height': int_or_none(source.get('height')),
'tbr': int_or_none(source.get('bitrate')),
} for source in video_data['sources']]
self._sort_formats(formats)
# For both metadata and downloaded files the duration varies among
# formats. I just pick the max one
duration = max(filter(None, [
float_or_none(source.get('duration'), scale=1000)
for source in player_params['sources']]))
for source in video_data['sources']]))
subtitles = {}
for translation in player_params.get('translations', []):
for translation in video_data.get('translations', []):
lang_id = translation.get('language_w3c') or ISO639Utils.long2short(translation['language_medium'])
if lang_id not in subtitles:
subtitles[lang_id] = []
@@ -124,8 +186,9 @@ class AdobeTVVideoIE(InfoExtractor):
return {
'id': video_id,
'formats': formats,
'title': player_params['title'],
'description': self._og_search_description(webpage),
'title': video_data['title'],
'description': video_data.get('description'),
'thumbnail': video_data['video'].get('poster'),
'duration': duration,
'subtitles': subtitles,
}
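The paging loop in `_extract_playlist_entries` earlier in this file reads the page count from the first response and then requests pages 2 to N. A self-contained sketch of that pattern, with network access replaced by an in-memory stand-in; `collect_entries`, `fake_fetch` and `FAKE_PAGES` are hypothetical names, and the response shape (`data`, `paging.pages`) is taken from the code above:

```python
def collect_entries(fetch_page):
    """Gather items from every page; fetch_page(n) returns one parsed response."""
    first = fetch_page(1)
    entries = list(first['data'])
    # The first response says how many pages exist; fetch the remaining ones.
    for page_num in range(2, first['paging']['pages'] + 1):
        entries.extend(fetch_page(page_num)['data'])
    return entries


# In-memory stand-in for the paged API (illustrative data only).
FAKE_PAGES = {1: ['a', 'b'], 2: ['c', 'd'], 3: ['e']}


def fake_fetch(page_num):
    return {'data': FAKE_PAGES[page_num], 'paging': {'pages': len(FAKE_PAGES)}}


print(collect_entries(fake_fetch))
```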


@@ -0,0 +1,80 @@
# coding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor
from ..utils import (
    int_or_none,
    parse_iso8601,
    sanitized_Request,
)


class AudiMediaIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?audimedia\.tv/(?:en|de)/vid/(?P<id>[^/?#]+)'
    _TEST = {
        'url': 'https://audimedia.tv/en/vid/60-seconds-of-audi-sport-104-2015-wec-bahrain-rookie-test',
        'md5': '79a8b71c46d49042609795ab59779b66',
        'info_dict': {
            'id': '1564',
            'ext': 'mp4',
            'title': '60 Seconds of Audi Sport 104/2015 - WEC Bahrain, Rookie Test',
            'description': 'md5:60e5d30a78ced725f7b8d34370762941',
            'upload_date': '20151124',
            'timestamp': 1448354940,
            'duration': 74022,
            'view_count': int,
        }
    }
    # extracted from https://audimedia.tv/assets/embed/embedded-player.js (dataSourceAuthToken)
    _AUTH_TOKEN = 'e25b42847dba18c6c8816d5d8ce94c326e06823ebf0859ed164b3ba169be97f2'

    def _real_extract(self, url):
        display_id = self._match_id(url)
        webpage = self._download_webpage(url, display_id)

        raw_payload = self._search_regex(r'<script[^>]+class="amtv-embed"[^>]+id="([^"]+)"', webpage, 'raw payload')
        _, stage_mode, video_id, lang = raw_payload.split('-')

        # TODO: handle s and e stage_mode (live streams and ended live streams)
        if stage_mode not in ('s', 'e'):
            request = sanitized_Request(
                'https://audimedia.tv/api/video/v1/videos/%s?embed[]=video_versions&embed[]=thumbnail_image&where[content_language_iso]=%s' % (video_id, lang),
                headers={'X-Auth-Token': self._AUTH_TOKEN})
            json_data = self._download_json(request, video_id)['results']
            formats = []

            stream_url_hls = json_data.get('stream_url_hls')
            if stream_url_hls:
                m3u8_formats = self._extract_m3u8_formats(stream_url_hls, video_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls', fatal=False)
                if m3u8_formats:
                    formats.extend(m3u8_formats)

            stream_url_hds = json_data.get('stream_url_hds')
            if stream_url_hds:
                f4m_formats = self._extract_f4m_formats(json_data.get('stream_url_hds') + '?hdcore=3.4.0', video_id, -1, f4m_id='hds', fatal=False)
                if f4m_formats:
                    formats.extend(f4m_formats)

            for video_version in json_data.get('video_versions'):
                video_version_url = video_version.get('download_url') or video_version.get('stream_url')
                if not video_version_url:
                    continue
                formats.append({
                    'url': video_version_url,
                    'width': int_or_none(video_version.get('width')),
                    'height': int_or_none(video_version.get('height')),
                    'abr': int_or_none(video_version.get('audio_bitrate')),
                    'vbr': int_or_none(video_version.get('video_bitrate')),
                })
            self._sort_formats(formats)

            return {
                'id': video_id,
                'title': json_data['title'],
                'description': json_data.get('subtitle'),
                'thumbnail': json_data.get('thumbnail_image', {}).get('file'),
                'timestamp': parse_iso8601(json_data.get('publication_date')),
                'duration': int_or_none(json_data.get('duration')),
                'view_count': int_or_none(json_data.get('view_count')),
                'formats': formats,
            }


@@ -22,7 +22,8 @@ from ..compat import (
class BBCCoUkIE(InfoExtractor):
IE_NAME = 'bbc.co.uk'
IE_DESC = 'BBC iPlayer'
_VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/(?:(?:programmes/(?!articles/)|iplayer(?:/[^/]+)?/(?:episode/|playlist/))|music/clips[/#])(?P<id>[\da-z]{8})'
_ID_REGEX = r'[pb][\da-z]{7}'
_VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/(?:(?:programmes/(?!articles/)|iplayer(?:/[^/]+)?/(?:episode/|playlist/))|music/clips[/#])(?P<id>%s)' % _ID_REGEX
_MEDIASELECTOR_URLS = [
# Provides HQ HLS streams with even better quality than pc mediaset but fails
@@ -46,9 +47,8 @@ class BBCCoUkIE(InfoExtractor):
'info_dict': {
'id': 'b039d07m',
'ext': 'flv',
'title': 'Kaleidoscope, Leonard Cohen',
'title': 'Leonard Cohen, Kaleidoscope - BBC Radio 4',
'description': 'The Canadian poet and songwriter reflects on his musical career.',
'duration': 1740,
},
'params': {
# rtmp download
@@ -111,7 +111,8 @@ class BBCCoUkIE(InfoExtractor):
'params': {
# rtmp download
'skip_download': True,
}
},
'skip': 'Episode is no longer available on BBC iPlayer Radio',
}, {
'url': 'http://www.bbc.co.uk/music/clips/p02frcc3',
'note': 'Audio',
@@ -453,6 +454,7 @@ class BBCCoUkIE(InfoExtractor):
webpage = self._download_webpage(url, group_id, 'Downloading video page')
programme_id = None
duration = None
tviplayer = self._search_regex(
r'mediator\.bind\(({.+?})\s*,\s*document\.getElementById',
@@ -465,14 +467,16 @@ class BBCCoUkIE(InfoExtractor):
if not programme_id:
programme_id = self._search_regex(
r'"vpid"\s*:\s*"([\da-z]{8})"', webpage, 'vpid', fatal=False, default=None)
r'"vpid"\s*:\s*"(%s)"' % self._ID_REGEX, webpage, 'vpid', fatal=False, default=None)
if programme_id:
formats, subtitles = self._download_media_selector(programme_id)
title = self._og_search_title(webpage)
description = self._search_regex(
r'<p class="[^"]*medium-description[^"]*">([^<]+)</p>',
webpage, 'description', fatal=False)
webpage, 'description', default=None)
if not description:
description = self._html_search_meta('description', webpage)
else:
programme_id, title, description, duration, formats, subtitles = self._download_playlist(group_id)
@@ -586,6 +590,7 @@ class BBCIE(BBCCoUkIE):
'ext': 'mp4',
'title': '''Judge Mindy Glazer: "I'm sorry to see you here... I always wondered what happened to you"''',
'duration': 56,
'description': '''Judge Mindy Glazer: "I'm sorry to see you here... I always wondered what happened to you"''',
},
'params': {
'skip_download': True,
@@ -728,6 +733,7 @@ class BBCIE(BBCCoUkIE):
# article with multiple videos embedded with playlist.sxml (e.g.
# http://www.bbc.com/sport/0/football/34475836)
playlists = re.findall(r'<param[^>]+name="playlist"[^>]+value="([^"]+)"', webpage)
playlists.extend(re.findall(r'data-media-id="([^"]+/playlist\.sxml)"', webpage))
if playlists:
entries = [
self._extract_from_playlist_sxml(playlist_url, playlist_id, timestamp)
@@ -780,8 +786,9 @@ class BBCIE(BBCCoUkIE):
# single video story (e.g. http://www.bbc.com/travel/story/20150625-sri-lankas-spicy-secret)
programme_id = self._search_regex(
[r'data-video-player-vpid="([\da-z]{8})"',
r'<param[^>]+name="externalIdentifier"[^>]+value="([\da-z]{8})"'],
[r'data-video-player-vpid="(%s)"' % self._ID_REGEX,
r'<param[^>]+name="externalIdentifier"[^>]+value="(%s)"' % self._ID_REGEX,
r'videoId\s*:\s*["\'](%s)["\']' % self._ID_REGEX],
webpage, 'vpid', default=None)
if programme_id:
@@ -816,7 +823,7 @@ class BBCIE(BBCCoUkIE):
# Multiple video article (e.g.
# http://www.bbc.co.uk/blogs/adamcurtis/entries/3662a707-0af9-3149-963f-47bea720b460)
EMBED_URL = r'https?://(?:www\.)?bbc\.co\.uk/(?:[^/]+/)+[\da-z]{8}(?:\b[^"]+)?'
EMBED_URL = r'https?://(?:www\.)?bbc\.co\.uk/(?:[^/]+/)+%s(?:\b[^"]+)?' % self._ID_REGEX
entries = []
for match in extract_all(r'new\s+SMP\(({.+?})\)'):
embed_url = match.get('playerSettings', {}).get('externalEmbedUrl')
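The hunks above replace the generic eight-character ID pattern with the shared `_ID_REGEX`, which additionally requires a leading `p` or `b`. The diff does not state the motivation, so the sketch below only demonstrates the observable difference between the two patterns; `football` is an arbitrary eight-letter word chosen for illustration:

```python
import re

OLD_ID_REGEX = r'[\da-z]{8}'      # pattern used before this change
NEW_ID_REGEX = r'[pb][\da-z]{7}'  # _ID_REGEX introduced by this change

# IDs that appear in the test cases above are accepted by both patterns.
for programme_id in ('b039d07m', 'p02frcc3'):
    print(programme_id,
          bool(re.fullmatch(OLD_ID_REGEX, programme_id)),
          bool(re.fullmatch(NEW_ID_REGEX, programme_id)))

# An ordinary eight-letter path segment passes the old pattern only.
print(bool(re.fullmatch(OLD_ID_REGEX, 'football')),
      bool(re.fullmatch(NEW_ID_REGEX, 'football')))
```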


@@ -1,6 +1,11 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import (
compat_chr,
compat_ord,
compat_urllib_parse_unquote,
)
from ..utils import (
int_or_none,
parse_iso8601,
@@ -29,7 +34,38 @@ class BeegIE(InfoExtractor):
video_id = self._match_id(url)
video = self._download_json(
'http://beeg.com/api/v1/video/%s' % video_id, video_id)
'http://beeg.com/api/v5/video/%s' % video_id, video_id)
def split(o, e):
def cut(s, x):
n.append(s[:x])
return s[x:]
n = []
r = len(o) % e
if r > 0:
o = cut(o, r)
while len(o) > e:
o = cut(o, e)
n.append(o)
return n
def decrypt_key(key):
# Reverse engineered from http://static.beeg.com/cpl/1105.js
a = '5ShMcIQlssOd7zChAIOlmeTZDaUxULbJRnywYaiB'
e = compat_urllib_parse_unquote(key)
o = ''.join([
compat_chr(compat_ord(e[n]) - compat_ord(a[n % len(a)]) % 21)
for n in range(len(e))])
return ''.join(split(o, 3)[::-1])
def decrypt_url(encrypted_url):
encrypted_url = self._proto_relative_url(
encrypted_url.replace('{DATA_MARKERS}', ''), 'http:')
key = self._search_regex(
r'/key=(.*?)%2Cend=', encrypted_url, 'key', default=None)
if not key:
return encrypted_url
return encrypted_url.replace(key, decrypt_key(key))
formats = []
for format_id, video_url in video.items():
@@ -40,7 +76,7 @@ class BeegIE(InfoExtractor):
if not height:
continue
formats.append({
'url': self._proto_relative_url(video_url.replace('{DATA_MARKERS}', ''), 'http:'),
'url': decrypt_url(video_url),
'format_id': format_id,
'height': int(height),
})
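The key handling added in this file has two steps: shift each character down by an amount derived from a fixed string, then split the result into groups of three (any shorter group comes first) and reverse the group order. A standalone Python 3 sketch of the same logic, assuming `urllib.parse.unquote`, `chr` and `ord` in place of the `compat_*` helpers; the names `KEY`, `split_chunks` and the sample input are illustrative:

```python
from urllib.parse import unquote

# Constant copied from decrypt_key in the diff above.
KEY = '5ShMcIQlssOd7zChAIOlmeTZDaUxULbJRnywYaiB'


def split_chunks(text, size):
    """Split text into chunks of `size`; a shorter leftover chunk comes first."""
    chunks = []
    remainder = len(text) % size
    if remainder > 0:
        chunks.append(text[:remainder])
        text = text[remainder:]
    while len(text) > size:
        chunks.append(text[:size])
        text = text[size:]
    chunks.append(text)
    return chunks


def decrypt_key(key):
    encrypted = unquote(key)
    # Shift each character down by (code point of the KEY character) modulo 21.
    shifted = ''.join(
        chr(ord(ch) - ord(KEY[i % len(KEY)]) % 21)
        for i, ch in enumerate(encrypted))
    # Reverse the order of the chunks, not the characters inside them.
    return ''.join(split_chunks(shifted, 3)[::-1])


print(split_chunks('abcdefgh', 3))
```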


@@ -2,143 +2,109 @@
from __future__ import unicode_literals
import re
import itertools
import json
from .common import InfoExtractor
from ..compat import (
compat_etree_fromstring,
)
from ..compat import compat_str
from ..utils import (
int_or_none,
unified_strdate,
unescapeHTML,
ExtractorError,
xpath_text,
)
class BiliBiliIE(InfoExtractor):
_VALID_URL = r'http://www\.bilibili\.(?:tv|com)/video/av(?P<id>[0-9]+)/'
_VALID_URL = r'http://www\.bilibili\.(?:tv|com)/video/av(?P<id>\d+)(?:/index_(?P<page_num>\d+).html)?'
_TESTS = [{
'url': 'http://www.bilibili.tv/video/av1074402/',
'md5': '2c301e4dab317596e837c3e7633e7d86',
'info_dict': {
'id': '1074402_part1',
'id': '1554319',
'ext': 'flv',
'title': '【金坷垃】金泡沫',
'duration': 308,
'duration': 308313,
'upload_date': '20140420',
'thumbnail': 're:^https?://.+\.jpg',
'description': 'md5:ce18c2a2d2193f0df2917d270f2e5923',
'timestamp': 1397983878,
'uploader': '菊子桑',
},
}, {
'url': 'http://www.bilibili.com/video/av1041170/',
'info_dict': {
'id': '1041170',
'title': '【BD1080P】刀语【诸神&异域】',
'description': '这是个神奇的故事~每个人不留弹幕不给走哦~切利哦!~',
'uploader': '枫叶逝去',
'timestamp': 1396501299,
},
'playlist_count': 9,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
page_num = mobj.group('page_num') or '1'
if '(此视频不存在或被删除)' in webpage:
raise ExtractorError(
'The video does not exist or was deleted', expected=True)
view_data = self._download_json(
'http://api.bilibili.com/view?type=json&appkey=8e9fc618fbd41e28&id=%s&page=%s' % (video_id, page_num),
video_id)
if 'error' in view_data:
raise ExtractorError('%s said: %s' % (self.IE_NAME, view_data['error']), expected=True)
if '>你没有权限浏览! 由于版权相关问题 我们不对您所在的地区提供服务<' in webpage:
raise ExtractorError(
'The video is not available in your region due to copyright reasons',
expected=True)
cid = view_data['cid']
title = unescapeHTML(view_data['title'])
video_code = self._search_regex(
r'(?s)<div itemprop="video".*?>(.*?)</div>', webpage, 'video code')
doc = self._download_xml(
'http://interface.bilibili.com/v_cdn_play?appkey=8e9fc618fbd41e28&cid=%s' % cid,
cid,
'Downloading page %s/%s' % (page_num, view_data['pages'])
)
title = self._html_search_meta(
'media:title', video_code, 'title', fatal=True)
duration_str = self._html_search_meta(
'duration', video_code, 'duration')
if duration_str is None:
duration = None
else:
duration_mobj = re.match(
r'^T(?:(?P<hours>[0-9]+)H)?(?P<minutes>[0-9]+)M(?P<seconds>[0-9]+)S$',
duration_str)
duration = (
int_or_none(duration_mobj.group('hours'), default=0) * 3600 +
int(duration_mobj.group('minutes')) * 60 +
int(duration_mobj.group('seconds')))
upload_date = unified_strdate(self._html_search_meta(
'uploadDate', video_code, fatal=False))
thumbnail = self._html_search_meta(
'thumbnailUrl', video_code, 'thumbnail', fatal=False)
cid = self._search_regex(r'cid=(\d+)', webpage, 'cid')
if xpath_text(doc, './result') == 'error':
raise ExtractorError('%s said: %s' % (self.IE_NAME, xpath_text(doc, './message')), expected=True)
entries = []
lq_page = self._download_webpage(
'http://interface.bilibili.com/v_cdn_play?appkey=1&cid=%s' % cid,
video_id,
note='Downloading LQ video info'
)
try:
err_info = json.loads(lq_page)
raise ExtractorError(
'BiliBili said: ' + err_info['error_text'], expected=True)
except ValueError:
pass
lq_doc = compat_etree_fromstring(lq_page)
lq_durls = lq_doc.findall('./durl')
hq_doc = self._download_xml(
'http://interface.bilibili.com/playurl?appkey=1&cid=%s' % cid,
video_id,
note='Downloading HQ video info',
fatal=False,
)
if hq_doc is not False:
hq_durls = hq_doc.findall('./durl')
assert len(lq_durls) == len(hq_durls)
else:
hq_durls = itertools.repeat(None)
i = 1
for lq_durl, hq_durl in zip(lq_durls, hq_durls):
for durl in doc.findall('./durl'):
size = xpath_text(durl, ['./filesize', './size'])
formats = [{
'format_id': 'lq',
'quality': 1,
'url': lq_durl.find('./url').text,
'filesize': int_or_none(
lq_durl.find('./size'), get_attr='text'),
'url': durl.find('./url').text,
'filesize': int_or_none(size),
'ext': 'flv',
}]
if hq_durl is not None:
formats.append({
'format_id': 'hq',
'quality': 2,
'ext': 'flv',
'url': hq_durl.find('./url').text,
'filesize': int_or_none(
hq_durl.find('./size'), get_attr='text'),
})
self._sort_formats(formats)
backup_urls = durl.find('./backup_url')
if backup_urls is not None:
for backup_url in backup_urls.findall('./url'):
formats.append({'url': backup_url.text})
formats.reverse()
entries.append({
'id': '%s_part%d' % (video_id, i),
'id': '%s_part%s' % (cid, xpath_text(durl, './order')),
'title': title,
'duration': int_or_none(xpath_text(durl, './length'), 1000),
'formats': formats,
'duration': duration,
'upload_date': upload_date,
'thumbnail': thumbnail,
})
i += 1
return {
'_type': 'multi_video',
'entries': entries,
'id': video_id,
'title': title
info = {
'id': compat_str(cid),
'title': title,
'description': view_data.get('description'),
'thumbnail': view_data.get('pic'),
'uploader': view_data.get('author'),
'timestamp': int_or_none(view_data.get('created')),
'view_count': int_or_none(view_data.get('play')),
'duration': int_or_none(xpath_text(doc, './timelength')),
}
if len(entries) == 1:
entries[0].update(info)
return entries[0]
else:
info.update({
'_type': 'multi_video',
'id': video_id,
'entries': entries,
})
return info

View File

@@ -6,7 +6,7 @@ from .common import InfoExtractor
class BloombergIE(InfoExtractor):
_VALID_URL = r'https?://www\.bloomberg\.com/news/[^/]+/[^/]+/(?P<id>[^/?#]+)'
_VALID_URL = r'https?://(?:www\.)?bloomberg\.com/(?:[^/]+/)*(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'http://www.bloomberg.com/news/videos/b/aaeae121-5949-481e-a1ce-4562db6f5df2',
@@ -20,22 +20,36 @@ class BloombergIE(InfoExtractor):
}, {
'url': 'http://www.bloomberg.com/news/articles/2015-11-12/five-strange-things-that-have-been-happening-in-financial-markets',
'only_matching': True,
}, {
'url': 'http://www.bloomberg.com/politics/videos/2015-11-25/karl-rove-on-jeb-bush-s-struggles-stopping-trump',
'only_matching': True,
}]
def _real_extract(self, url):
name = self._match_id(url)
webpage = self._download_webpage(url, name)
video_id = self._search_regex(r'"bmmrId":"(.+?)"', webpage, 'id')
video_id = self._search_regex(
r'["\']bmmrId["\']\s*:\s*(["\'])(?P<url>.+?)\1',
webpage, 'id', group='url')
title = re.sub(': Video$', '', self._og_search_title(webpage))
embed_info = self._download_json(
'http://www.bloomberg.com/api/embed?id=%s' % video_id, video_id)
formats = []
for stream in embed_info['streams']:
if stream["muxing_format"] == "TS":
formats.extend(self._extract_m3u8_formats(stream['url'], video_id))
stream_url = stream.get('url')
if not stream_url:
continue
if stream['muxing_format'] == 'TS':
m3u8_formats = self._extract_m3u8_formats(
stream_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
if m3u8_formats:
formats.extend(m3u8_formats)
else:
formats.extend(self._extract_f4m_formats(stream['url'], video_id))
f4m_formats = self._extract_f4m_formats(
stream_url, video_id, f4m_id='hds', fatal=False)
if f4m_formats:
formats.extend(f4m_formats)
self._sort_formats(formats)
return {
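
The new `bmmrId` pattern in this hunk accepts either quote style by capturing the opening quote and requiring the same character to close the value. A small sketch using only the standard `re` module shows the behaviour; the sample strings are made up for illustration and are not taken from a real Bloomberg page.

```python
import re

# Pattern copied from the hunk above; \1 refers back to the opening quote.
BMMR_ID_RE = r'["\']bmmrId["\']\s*:\s*(["\'])(?P<url>.+?)\1'


def find_bmmr_id(webpage):
    """Return the bmmrId value, or None when the page does not contain one."""
    mobj = re.search(BMMR_ID_RE, webpage)
    return mobj.group('url') if mobj else None
```

The backreference means a value opened with a double quote cannot be terminated early by a single quote inside it, and vice versa.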

View File

@@ -14,9 +14,10 @@ class BYUtvIE(InfoExtractor):
'info_dict': {
'id': 'studio-c-season-5-episode-5',
'ext': 'mp4',
'description': 'md5:5438d33774b6bdc662f9485a340401cc',
'description': 'md5:e07269172baff037f8e8bf9956bc9747',
'title': 'Season 5 Episode 5',
'thumbnail': 're:^https?://.*\.jpg$'
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 1486.486,
},
'params': {
'skip_download': True,

View File

@@ -1,14 +1,9 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
determine_ext,
int_or_none,
js_to_json,
parse_iso8601,
remove_end,
unified_strdate,
)
@@ -21,48 +16,47 @@ class ClipfishIE(InfoExtractor):
'id': '3966754',
'ext': 'mp4',
'title': 'FIFA 14 - E3 2013 Trailer',
'timestamp': 1370938118,
'description': 'Video zu FIFA 14: E3 2013 Trailer',
'upload_date': '20130611',
'duration': 82,
'view_count': int,
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_info = self._parse_json(
js_to_json(self._html_search_regex(
'(?s)videoObject\s*=\s*({.+?});', webpage, 'video object')),
video_id)
video_info = self._download_json(
'http://www.clipfish.de/devapi/id/%s?format=json&apikey=hbbtv' % video_id,
video_id)['items'][0]
formats = []
for video_url in re.findall(r'var\s+videourl\s*=\s*"([^"]+)"', webpage):
ext = determine_ext(video_url)
if ext == 'm3u8':
formats.append({
'url': video_url.replace('de.hls.fra.clipfish.de', 'hls.fra.clipfish.de'),
'ext': 'mp4',
'format_id': 'hls',
})
else:
formats.append({
'url': video_url,
'format_id': ext,
})
self._sort_formats(formats)
title = remove_end(self._og_search_title(webpage), ' - Video')
thumbnail = self._og_search_thumbnail(webpage)
duration = int_or_none(video_info.get('length'))
timestamp = parse_iso8601(self._html_search_meta('uploadDate', webpage, 'upload date'))
m3u8_url = video_info.get('media_videourl_hls')
if m3u8_url:
formats.append({
'url': m3u8_url.replace('de.hls.fra.clipfish.de', 'hls.fra.clipfish.de'),
'ext': 'mp4',
'format_id': 'hls',
})
mp4_url = video_info.get('media_videourl')
if mp4_url:
formats.append({
'url': mp4_url,
'format_id': 'mp4',
'width': int_or_none(video_info.get('width')),
'height': int_or_none(video_info.get('height')),
'tbr': int_or_none(video_info.get('bitrate')),
})
return {
'id': video_id,
'title': title,
'title': video_info['title'],
'description': video_info.get('descr'),
'formats': formats,
'thumbnail': thumbnail,
'duration': duration,
'timestamp': timestamp,
'thumbnail': video_info.get('media_content_thumbnail_large') or video_info.get('media_thumbnail'),
'duration': int_or_none(video_info.get('media_length')),
'upload_date': unified_strdate(video_info.get('pubDate')),
'view_count': int_or_none(video_info.get('media_views'))
}

View File

@@ -1,7 +1,7 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import determine_ext
from ..utils import int_or_none
_translation_table = {
@@ -42,31 +42,26 @@ class CliphunterIE(InfoExtractor):
video_title = self._search_regex(
r'mediaTitle = "([^"]+)"', webpage, 'title')
fmts = {}
for fmt in ('mp4', 'flv'):
fmt_list = self._parse_json(self._search_regex(
r'var %sjson\s*=\s*(\[.*?\]);' % fmt, webpage, '%s formats' % fmt), video_id)
for f in fmt_list:
fmts[f['fname']] = _decode(f['sUrl'])
qualities = self._parse_json(self._search_regex(
r'var player_btns\s*=\s*(.*?);\n', webpage, 'quality info'), video_id)
gexo_files = self._parse_json(
self._search_regex(
r'var\s+gexoFiles\s*=\s*({.+?});', webpage, 'gexo files'),
video_id)
formats = []
for fname, url in fmts.items():
f = {
'url': url,
}
if fname in qualities:
qual = qualities[fname]
f.update({
'format_id': '%s_%sp' % (determine_ext(url), qual['h']),
'width': qual['w'],
'height': qual['h'],
'tbr': qual['br'],
})
formats.append(f)
for format_id, f in gexo_files.items():
video_url = f.get('url')
if not video_url:
continue
fmt = f.get('fmt')
height = f.get('h')
format_id = '%s_%sp' % (fmt, height) if fmt and height else format_id
formats.append({
'url': _decode(video_url),
'format_id': format_id,
'width': int_or_none(f.get('w')),
'height': int_or_none(height),
'tbr': int_or_none(f.get('br')),
})
self._sort_formats(formats)
thumbnail = self._search_regex(

View File

@@ -167,7 +167,7 @@ class InfoExtractor(object):
"ext" will be calculated from URL if missing
automatic_captions: Like 'subtitles', used by the YoutubeIE for
automatically generated captions
duration: Length of the video in seconds, as an integer.
duration: Length of the video in seconds, as an integer or float.
view_count: How many users have watched the video on the platform.
like_count: Number of positive ratings of the video
dislike_count: Number of negative ratings of the video

View File

@@ -9,6 +9,7 @@ from ..utils import (
find_xpath_attr,
smuggle_url,
determine_ext,
ExtractorError,
)
from .senateisvp import SenateISVPIE
@@ -18,33 +19,32 @@ class CSpanIE(InfoExtractor):
IE_DESC = 'C-SPAN'
_TESTS = [{
'url': 'http://www.c-span.org/video/?313572-1/HolderonV',
'md5': '8e44ce11f0f725527daccc453f553eb0',
'md5': '94b29a4f131ff03d23471dd6f60b6a1d',
'info_dict': {
'id': '315139',
'ext': 'mp4',
'title': 'Attorney General Eric Holder on Voting Rights Act Decision',
'description': 'Attorney General Eric Holder spoke to reporters following the Supreme Court decision in Shelby County v. Holder in which the court ruled that the preclearance provisions of the Voting Rights Act could not be enforced until Congress established new guidelines for review.',
'description': 'Attorney General Eric Holder speaks to reporters following the Supreme Court decision in [Shelby County v. Holder], in which the court ruled that the preclearance provisions of the Voting Rights Act could not be enforced.',
},
'skip': 'Regularly fails on travis, for unknown reasons',
}, {
'url': 'http://www.c-span.org/video/?c4486943/cspan-international-health-care-models',
# For whatever reason, the served video alternates between
# two different ones
'md5': '8e5fbfabe6ad0f89f3012a7943c1287b',
'info_dict': {
'id': '340723',
'id': 'c4486943',
'ext': 'mp4',
'title': 'International Health Care Models',
'title': 'CSPAN - International Health Care Models',
'description': 'md5:7a985a2d595dba00af3d9c9f0783c967',
}
}, {
'url': 'http://www.c-span.org/video/?318608-1/gm-ignition-switch-recall',
'md5': '446562a736c6bf97118e389433ed88d4',
'md5': '2ae5051559169baadba13fc35345ae74',
'info_dict': {
'id': '342759',
'ext': 'mp4',
'title': 'General Motors Ignition Switch Recall',
'duration': 14848,
'description': 'md5:70c7c3b8fa63fa60d42772440596034c'
'description': 'md5:118081aedd24bf1d3b68b3803344e7f3'
},
}, {
# Video from senate.gov
@@ -57,67 +57,77 @@ class CSpanIE(InfoExtractor):
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
page_id = mobj.group('id')
webpage = self._download_webpage(url, page_id)
video_id = self._search_regex(r'progid=\'?([0-9]+)\'?>', webpage, 'video id')
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
matches = re.search(r'data-(prog|clip)id=\'([0-9]+)\'', webpage)
if matches:
video_type, video_id = matches.groups()
if video_type == 'prog':
video_type = 'program'
else:
senate_isvp_url = SenateISVPIE._search_iframe_url(webpage)
if senate_isvp_url:
title = self._og_search_title(webpage)
surl = smuggle_url(senate_isvp_url, {'force_title': title})
return self.url_result(surl, 'SenateISVP', video_id, title)
description = self._html_search_regex(
[
# The full description
r'<div class=\'expandable\'>(.*?)<a href=\'#\'',
# If the description is small enough the other div is not
# present, otherwise this is a stripped version
r'<p class=\'initial\'>(.*?)</p>'
],
webpage, 'description', flags=re.DOTALL, default=None)
def get_text_attr(d, attr):
return d.get(attr, {}).get('#text')
info_url = 'http://c-spanvideo.org/videoLibrary/assets/player/ajax-player.php?os=android&html5=program&id=' + video_id
data = self._download_json(info_url, video_id)
data = self._download_json(
'http://www.c-span.org/assets/player/ajax-player.php?os=android&html5=%s&id=%s' % (video_type, video_id),
video_id)['video']
if data['@status'] != 'Success':
raise ExtractorError('%s said: %s' % (self.IE_NAME, get_text_attr(data, 'error')), expected=True)
doc = self._download_xml(
'http://www.c-span.org/common/services/flashXml.php?programid=' + video_id,
'http://www.c-span.org/common/services/flashXml.php?%sid=%s' % (video_type, video_id),
video_id)
description = self._html_search_meta('description', webpage)
title = find_xpath_attr(doc, './/string', 'name', 'title').text
thumbnail = find_xpath_attr(doc, './/string', 'name', 'poster').text
senate_isvp_url = SenateISVPIE._search_iframe_url(webpage)
if senate_isvp_url:
surl = smuggle_url(senate_isvp_url, {'force_title': title})
return self.url_result(surl, 'SenateISVP', video_id, title)
files = data['files']
capfile = get_text_attr(data, 'capfile')
files = data['video']['files']
try:
capfile = data['video']['capfile']['#text']
except KeyError:
capfile = None
entries = [{
'id': '%s_%d' % (video_id, partnum + 1),
'title': (
title if len(files) == 1 else
'%s part %d' % (title, partnum + 1)),
'url': unescapeHTML(f['path']['#text']),
'description': description,
'thumbnail': thumbnail,
'duration': int_or_none(f.get('length', {}).get('#text')),
'subtitles': {
'en': [{
'url': capfile,
'ext': determine_ext(capfile, 'dfxp')
}],
} if capfile else None,
} for partnum, f in enumerate(files)]
entries = []
for partnum, f in enumerate(files):
formats = []
for quality in f['qualities']:
formats.append({
'format_id': '%s-%sp' % (get_text_attr(quality, 'bitrate'), get_text_attr(quality, 'height')),
'url': unescapeHTML(get_text_attr(quality, 'file')),
'height': int_or_none(get_text_attr(quality, 'height')),
'tbr': int_or_none(get_text_attr(quality, 'bitrate')),
})
self._sort_formats(formats)
entries.append({
'id': '%s_%d' % (video_id, partnum + 1),
'title': (
title if len(files) == 1 else
'%s part %d' % (title, partnum + 1)),
'formats': formats,
'description': description,
'thumbnail': thumbnail,
'duration': int_or_none(get_text_attr(f, 'length')),
'subtitles': {
'en': [{
'url': capfile,
'ext': determine_ext(capfile, 'dfxp')
}],
} if capfile else None,
})
if len(entries) == 1:
entry = dict(entries[0])
entry['id'] = video_id
entry['id'] = 'c' + video_id if video_type == 'clip' else video_id
return entry
else:
return {
'_type': 'playlist',
'entries': entries,
'title': title,
'id': video_id,
'id': 'c' + video_id if video_type == 'clip' else video_id,
}

View File

@@ -99,6 +99,11 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
{
'url': 'http://www.dailymotion.com/video/xhza0o',
'only_matching': True,
},
# with subtitles
{
'url': 'http://www.dailymotion.com/video/x20su5f_the-power-of-nightmares-1-the-rise-of-the-politics-of-fear-bbc-2004_news',
'only_matching': True,
}
]
@@ -122,7 +127,9 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
webpage, 'comment count', fatal=False))
player_v5 = self._search_regex(
[r'buildPlayer\(({.+?})\);', r'playerV5\s*=\s*dmp\.create\([^,]+?,\s*({.+?})\);'],
[r'buildPlayer\(({.+?})\);\n', # See https://github.com/rg3/youtube-dl/issues/7826
r'playerV5\s*=\s*dmp\.create\([^,]+?,\s*({.+?})\);',
r'buildPlayer\(({.+?})\);'],
webpage, 'player v5', default=None)
if player_v5:
player = self._parse_json(player_v5, video_id)
@@ -172,11 +179,13 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
uploader_id = metadata.get('owner', {}).get('id')
subtitles = {}
for subtitle_lang, subtitle in metadata.get('subtitles', {}).get('data', {}).items():
subtitles[subtitle_lang] = [{
'ext': determine_ext(subtitle_url),
'url': subtitle_url,
} for subtitle_url in subtitle.get('urls', [])]
subtitles_data = metadata.get('subtitles', {}).get('data', {})
if subtitles_data and isinstance(subtitles_data, dict):
for subtitle_lang, subtitle in subtitles_data.items():
subtitles[subtitle_lang] = [{
'ext': determine_ext(subtitle_url),
'url': subtitle_url,
} for subtitle_url in subtitle.get('urls', [])]
return {
'id': video_id,
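
The subtitles change above only iterates when `data` is a non-empty dict. The sketch below reproduces that guard outside the extractor; `guess_ext` is a hypothetical stand-in for youtube-dl's `determine_ext`, and the empty-list input is an assumed example of a non-dict value rather than a documented API response.

```python
import posixpath
from urllib.parse import urlparse


def guess_ext(url):
    # Hypothetical simplification of determine_ext: extension of the URL path.
    return posixpath.splitext(urlparse(url).path)[1].lstrip('.') or None


def collect_subtitles(metadata):
    """Build a {lang: [{'ext', 'url'}, ...]} mapping, skipping non-dict data."""
    subtitles = {}
    subtitles_data = metadata.get('subtitles', {}).get('data', {})
    if subtitles_data and isinstance(subtitles_data, dict):
        for lang, subtitle in subtitles_data.items():
            subtitles[lang] = [{
                'ext': guess_ext(subtitle_url),
                'url': subtitle_url,
            } for subtitle_url in subtitle.get('urls', [])]
    return subtitles
```

Without the `isinstance` check, a list in place of the dict would raise `AttributeError` on `.items()`.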

View File

@@ -13,8 +13,8 @@ from ..utils import (
class DBTVIE(InfoExtractor):
_VALID_URL = r'http://dbtv\.no/(?P<id>[0-9]+)#(?P<display_id>.+)'
_TEST = {
_VALID_URL = r'https?://(?:www\.)?dbtv\.no/(?:(?:lazyplayer|player)/)?(?P<id>[0-9]+)(?:#(?P<display_id>.+))?'
_TESTS = [{
'url': 'http://dbtv.no/3649835190001#Skulle_teste_ut_fornøyelsespark,_men_kollegaen_var_bare_opptatt_av_bikinikroppen',
'md5': 'b89953ed25dacb6edb3ef6c6f430f8bc',
'info_dict': {
@@ -30,12 +30,18 @@ class DBTVIE(InfoExtractor):
'view_count': int,
'categories': list,
}
}
}, {
'url': 'http://dbtv.no/3649835190001',
'only_matching': True,
}, {
'url': 'http://www.dbtv.no/lazyplayer/4631135248001',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
display_id = mobj.group('display_id')
display_id = mobj.group('display_id') or video_id
data = self._download_json(
'http://api.dbtv.no/discovery/%s' % video_id, display_id)
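
The widened `_VALID_URL` makes the `#display_id` fragment optional, which is why `display_id` now falls back to the numeric id. A quick check with the standard `re` module, using the URLs from the test cases above (the helper name `parse_dbtv_url` is illustrative only, and the display id in the last usage line is shortened):

```python
import re

# Pattern copied from the new _VALID_URL in the hunk above.
VALID_URL = r'https?://(?:www\.)?dbtv\.no/(?:(?:lazyplayer|player)/)?(?P<id>[0-9]+)(?:#(?P<display_id>.+))?'


def parse_dbtv_url(url):
    """Return (video_id, display_id); display_id defaults to video_id."""
    mobj = re.match(VALID_URL, url)
    if mobj is None:
        raise ValueError('not a DBTV URL: %s' % url)
    video_id = mobj.group('id')
    display_id = mobj.group('display_id') or video_id
    return video_id, display_id


print(parse_dbtv_url('http://www.dbtv.no/lazyplayer/4631135248001'))
print(parse_dbtv_url('http://dbtv.no/3649835190001#Skulle_teste_ut'))
```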

View File

@@ -164,7 +164,7 @@ class FacebookIE(InfoExtractor):
if not video_title:
video_title = self._html_search_regex(
r'(?s)<span class="fbPhotosPhotoCaption".*?id="fbPhotoPageCaption"><span class="hasCaption">(.*?)</span>',
webpage, 'alternative title', fatal=False)
webpage, 'alternative title', default=None)
video_title = limit_length(video_title, 80)
if not video_title:
video_title = 'Facebook video #%s' % video_id

View File

@@ -37,8 +37,8 @@ class FC2IE(InfoExtractor):
'params': {
'username': 'ytdl@yt-dl.org',
'password': '(snip)',
'skip': 'requires actual password'
}
},
'skip': 'requires actual password',
}, {
'url': 'http://video.fc2.com/en/a/content/20130926eZpARwsF',
'only_matching': True,

View File

@@ -0,0 +1,193 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
clean_html,
determine_ext,
encode_dict,
int_or_none,
sanitized_Request,
ExtractorError,
urlencode_postdata
)
class FunimationIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?funimation\.com/shows/[^/]+/videos/(?:official|promotional)/(?P<id>[^/?#&]+)'
_NETRC_MACHINE = 'funimation'
_TESTS = [{
'url': 'http://www.funimation.com/shows/air/videos/official/breeze',
'info_dict': {
'id': '658',
'display_id': 'breeze',
'ext': 'mp4',
'title': 'Air - 1 - Breeze',
'description': 'md5:1769f43cd5fc130ace8fd87232207892',
'thumbnail': 're:https?://.*\.jpg',
},
}, {
'url': 'http://www.funimation.com/shows/hacksign/videos/official/role-play',
'info_dict': {
'id': '31128',
'display_id': 'role-play',
'ext': 'mp4',
'title': '.hack//SIGN - 1 - Role Play',
'description': 'md5:b602bdc15eef4c9bbb201bb6e6a4a2dd',
'thumbnail': 're:https?://.*\.jpg',
},
}, {
'url': 'http://www.funimation.com/shows/attack-on-titan-junior-high/videos/promotional/broadcast-dub-preview',
'info_dict': {
'id': '9635',
'display_id': 'broadcast-dub-preview',
'ext': 'mp4',
'title': 'Attack on Titan: Junior High - Broadcast Dub Preview',
'description': 'md5:f8ec49c0aff702a7832cd81b8a44f803',
'thumbnail': 're:https?://.*\.(?:jpg|png)',
},
}]
def _login(self):
(username, password) = self._get_login_info()
if username is None:
return
data = urlencode_postdata(encode_dict({
'email_field': username,
'password_field': password,
}))
login_request = sanitized_Request('http://www.funimation.com/login', data, headers={
'User-Agent': 'Mozilla/5.0 (Windows NT 5.2; WOW64; rv:42.0) Gecko/20100101 Firefox/42.0',
'Content-Type': 'application/x-www-form-urlencoded'
})
login_page = self._download_webpage(
login_request, None, 'Logging in as %s' % username)
if any(p in login_page for p in ('funimation.com/logout', '>Log Out<')):
return
error = self._html_search_regex(
r'(?s)<div[^>]+id=["\']errorMessages["\'][^>]*>(.+?)</div>',
login_page, 'error messages', default=None)
if error:
raise ExtractorError('Unable to log in: %s' % error, expected=True)
raise ExtractorError('Unable to log in')
def _real_initialize(self):
self._login()
def _real_extract(self, url):
display_id = self._match_id(url)
errors = []
formats = []
ERRORS_MAP = {
'ERROR_MATURE_CONTENT_LOGGED_IN': 'matureContentLoggedIn',
'ERROR_MATURE_CONTENT_LOGGED_OUT': 'matureContentLoggedOut',
'ERROR_SUBSCRIPTION_LOGGED_OUT': 'subscriptionLoggedOut',
'ERROR_VIDEO_EXPIRED': 'videoExpired',
'ERROR_TERRITORY_UNAVAILABLE': 'territoryUnavailable',
'SVODBASIC_SUBSCRIPTION_IN_PLAYER': 'basicSubscription',
'SVODNON_SUBSCRIPTION_IN_PLAYER': 'nonSubscription',
'ERROR_PLAYER_NOT_RESPONDING': 'playerNotResponding',
'ERROR_UNABLE_TO_CONNECT_TO_CDN': 'unableToConnectToCDN',
'ERROR_STREAM_NOT_FOUND': 'streamNotFound',
}
USER_AGENTS = (
# PC UA is served with m3u8 that provides some bonus lower quality formats
('pc', 'Mozilla/5.0 (Windows NT 5.2; WOW64; rv:42.0) Gecko/20100101 Firefox/42.0'),
# Mobile UA allows extracting direct links and also does not fail when
# the PC UA fails with a hulu error (e.g.
# http://www.funimation.com/shows/hacksign/videos/official/role-play)
('mobile', 'Mozilla/5.0 (Linux; Android 4.4.2; Nexus 4 Build/KOT49H) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.114 Mobile Safari/537.36'),
)
for kind, user_agent in USER_AGENTS:
request = sanitized_Request(url)
request.add_header('User-Agent', user_agent)
webpage = self._download_webpage(
request, display_id, 'Downloading %s webpage' % kind)
playlist = self._parse_json(
self._search_regex(
r'var\s+playersData\s*=\s*(\[.+?\]);\n',
webpage, 'players data'),
display_id)[0]['playlist']
items = next(item['items'] for item in playlist if item.get('items'))
item = next(item for item in items if item.get('itemAK') == display_id)
error_messages = {}
video_error_messages = self._search_regex(
r'var\s+videoErrorMessages\s*=\s*({.+?});\n',
webpage, 'error messages', default=None)
if video_error_messages:
error_messages_json = self._parse_json(video_error_messages, display_id, fatal=False)
if error_messages_json:
for _, error in error_messages_json.items():
type_ = error.get('type')
description = error.get('description')
content = error.get('content')
if type_ == 'text' and description and content:
error_message = ERRORS_MAP.get(description)
if error_message:
error_messages[error_message] = content
for video in item.get('videoSet', []):
auth_token = video.get('authToken')
if not auth_token:
continue
funimation_id = video.get('FUNImationID') or video.get('videoId')
preference = 1 if video.get('languageMode') == 'dub' else 0
if not auth_token.startswith('?'):
auth_token = '?%s' % auth_token
for quality, height in (('sd', 480), ('hd', 720), ('hd1080', 1080)):
format_url = video.get('%sUrl' % quality)
if not format_url:
continue
if not format_url.startswith(('http', '//')):
errors.append(format_url)
continue
if determine_ext(format_url) == 'm3u8':
m3u8_formats = self._extract_m3u8_formats(
format_url + auth_token, display_id, 'mp4', entry_protocol='m3u8_native',
preference=preference, m3u8_id='%s-hls' % funimation_id, fatal=False)
if m3u8_formats:
formats.extend(m3u8_formats)
else:
tbr = int_or_none(self._search_regex(
r'-(\d+)[Kk]', format_url, 'tbr', default=None))
formats.append({
'url': format_url + auth_token,
'format_id': '%s-http-%dp' % (funimation_id, height),
'height': height,
'tbr': tbr,
'preference': preference,
})
if not formats and errors:
raise ExtractorError(
'%s returned error: %s'
% (self.IE_NAME, clean_html(error_messages.get(errors[0], errors[0]))),
expected=True)
self._sort_formats(formats)
title = item['title']
artist = item.get('artist')
if artist:
title = '%s - %s' % (artist, title)
description = self._og_search_description(webpage) or item.get('description')
thumbnail = self._og_search_thumbnail(webpage) or item.get('posterUrl')
video_id = item.get('itemId') or display_id
return {
'id': video_id,
'display_id': display_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'formats': formats,
}

View File

@@ -0,0 +1,43 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import int_or_none
class GameInformerIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?gameinformer\.com/(?:[^/]+/)*(?P<id>.+)\.aspx'
_TEST = {
'url': 'http://www.gameinformer.com/b/features/archive/2015/09/26/replay-animal-crossing.aspx',
'info_dict': {
'id': '4515472681001',
'ext': 'm3u8',
'title': 'Replay - Animal Crossing',
'description': 'md5:2e211891b215c85d061adc7a4dd2d930',
'timestamp': 1443457610706,
},
'params': {
# m3u8 download
'skip_download': True,
},
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
bc_api_url = self._search_regex(r"getVideo\('([^']+)'", webpage, 'brightcove api url')
json_data = self._download_json(
bc_api_url + '&video_fields=id,name,shortDescription,publishedDate,videoStillURL,length,IOSRenditions',
display_id)
return {
'id': compat_str(json_data['id']),
'display_id': display_id,
'url': json_data['IOSRenditions'][0]['url'],
'title': json_data['name'],
'description': json_data.get('shortDescription'),
'timestamp': int_or_none(json_data.get('publishedDate')),
'duration': int_or_none(json_data.get('length')),
}

View File

@@ -1,19 +1,62 @@
from __future__ import unicode_literals
from .mtv import MTVServicesInfoExtractor
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_age_limit,
url_basename,
)
class GametrailersIE(MTVServicesInfoExtractor):
_VALID_URL = r'http://www\.gametrailers\.com/(?P<type>videos|reviews|full-episodes)/(?P<id>.*?)/(?P<title>.*)'
class GametrailersIE(InfoExtractor):
_VALID_URL = r'http://www\.gametrailers\.com/videos/view/[^/]+/(?P<id>.+)'
_TEST = {
'url': 'http://www.gametrailers.com/videos/zbvr8i/mirror-s-edge-2-e3-2013--debut-trailer',
'md5': '4c8e67681a0ea7ec241e8c09b3ea8cf7',
'url': 'http://www.gametrailers.com/videos/view/gametrailers-com/116437-Just-Cause-3-Review',
'md5': 'f28c4efa0bdfaf9b760f6507955b6a6a',
'info_dict': {
'id': '70e9a5d7-cf25-4a10-9104-6f3e7342ae0d',
'id': '2983958',
'ext': 'mp4',
'title': 'E3 2013: Debut Trailer',
'description': 'Faith is back! Check out the World Premiere trailer for Mirror\'s Edge 2 straight from the EA Press Conference at E3 2013!',
'display_id': '116437-Just-Cause-3-Review',
'title': 'Just Cause 3 - Review',
'description': 'It\'s a lot of fun to shoot at things and then watch them explode in Just Cause 3, but should there be more to the experience than that?',
},
}
_FEED_URL = 'http://www.gametrailers.com/feeds/mrss'
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
title = self._html_search_regex(
r'<title>(.+?)\|', webpage, 'title').strip()
embed_url = self._proto_relative_url(
self._search_regex(
r'src=\'(//embed.gametrailers.com/embed/[^\']+)\'', webpage,
'embed url'),
scheme='http:')
video_id = url_basename(embed_url)
embed_page = self._download_webpage(embed_url, video_id)
embed_vars_json = self._search_regex(
r'(?s)var embedVars = (\{.*?\})\s*</script>', embed_page,
'embed vars')
info = self._parse_json(embed_vars_json, video_id)
formats = []
for media in info['media']:
if media['mediaPurpose'] == 'play':
formats.append({
'url': media['uri'],
'height': media['height'],
'width': media['width'],
})
self._sort_formats(formats)
return {
'id': video_id,
'display_id': display_id,
'title': title,
'formats': formats,
'thumbnail': info.get('thumbUri'),
'description': self._og_search_description(webpage),
'duration': int_or_none(info.get('videoLengthInSeconds')),
'age_limit': parse_age_limit(info.get('audienceRating')),
}

View File

@@ -54,6 +54,7 @@ from .onionstudios import OnionStudiosIE
from .snagfilms import SnagFilmsEmbedIE
from .screenwavemedia import ScreenwaveMediaIE
from .mtv import MTVServicesEmbeddedIE
from .pladform import PladformIE
class GenericIE(InfoExtractor):
@@ -339,6 +340,7 @@ class GenericIE(InfoExtractor):
'id': 'BwY2RxaTrTkslxOfcan0UCf0YqyvWysJ',
'ext': 'mp4',
'title': '2cc213299525360.mov', # that's what we get
'duration': 238.231,
},
'add_ie': ['Ooyala'],
},
@@ -350,6 +352,7 @@ class GenericIE(InfoExtractor):
'ext': 'mp4',
'title': '"Steve Jobs: Man in the Machine" trailer',
'description': 'The first trailer for the Alex Gibney documentary "Steve Jobs: Man in the Machine."',
'duration': 135.427,
},
'params': {
'skip_download': True,
@@ -960,8 +963,9 @@ class GenericIE(InfoExtractor):
'info_dict': {
'id': '50YnY4czr4ms1vJ7yz3xzq0excz_pUMs',
'ext': 'mp4',
'description': 'VIDEO: Index/Match versus VLOOKUP.',
'description': 'VIDEO: INDEX/MATCH versus VLOOKUP.',
'title': 'This is what separates the Excel masters from the wannabes',
'duration': 191.933,
},
'params': {
# m3u8 downloads
@@ -1501,7 +1505,7 @@ class GenericIE(InfoExtractor):
re.search(r'SBN\.VideoLinkset\.ooyala\([\'"](?P<ec>.{32})[\'"]\)', webpage) or
re.search(r'data-ooyala-video-id\s*=\s*[\'"](?P<ec>.{32})[\'"]', webpage))
if mobj is not None:
return OoyalaIE._build_url_result(mobj.group('ec'))
return OoyalaIE._build_url_result(smuggle_url(mobj.group('ec'), {'domain': url}))
# Look for multiple Ooyala embeds on SBN network websites
mobj = re.search(r'SBN\.VideoLinkset\.entryGroup\((\[.*?\])', webpage)
@@ -1509,7 +1513,7 @@ class GenericIE(InfoExtractor):
embeds = self._parse_json(mobj.group(1), video_id, fatal=False)
if embeds:
return _playlist_from_matches(
embeds, getter=lambda v: OoyalaIE._url_for_embed_code(v['provider_video_id']), ie='Ooyala')
embeds, getter=lambda v: OoyalaIE._url_for_embed_code(smuggle_url(v['provider_video_id'], {'domain': url})), ie='Ooyala')
# Look for Aparat videos
mobj = re.search(r'<iframe .*?src="(http://www\.aparat\.com/video/[^"]+)"', webpage)
@@ -1738,10 +1742,9 @@ class GenericIE(InfoExtractor):
return self.url_result('eagleplatform:%(host)s:%(id)s' % mobj.groupdict(), 'EaglePlatform')
# Look for Pladform embeds
mobj = re.search(
r'<iframe[^>]+src="(?P<url>https?://out\.pladform\.ru/player\?.+?)"', webpage)
if mobj is not None:
return self.url_result(mobj.group('url'), 'Pladform')
pladform_url = PladformIE._extract_url(webpage)
if pladform_url:
return self.url_result(pladform_url)
# Look for Playwire embeds
mobj = re.search(

@@ -18,6 +18,8 @@ class GrouponIE(InfoExtractor):
'id': 'tubGNycTo_9Uxg82uESj4i61EYX8nyuf',
'ext': 'mp4',
'title': 'Bikram Yoga Huntington Beach | Orange County',
'description': 'md5:d41d8cd98f00b204e9800998ecf8427e',
'duration': 44.961,
},
}],
'params': {

@@ -16,6 +16,7 @@ class HowcastIE(InfoExtractor):
'description': 'md5:dbe792e5f6f1489027027bf2eba188a3',
'timestamp': 1276081287,
'upload_date': '20100609',
'duration': 56.823,
},
'params': {
# m3u8 download

@@ -28,15 +28,12 @@ class HypemIE(InfoExtractor):
track_id = self._match_id(url)
data = {'ax': 1, 'ts': time.time()}
data_encoded = compat_urllib_parse.urlencode(data)
complete_url = url + "?" + data_encoded
request = sanitized_Request(complete_url)
request = sanitized_Request(url + '?' + compat_urllib_parse.urlencode(data))
response, urlh = self._download_webpage_handle(
request, track_id, 'Downloading webpage with the url')
cookie = urlh.headers.get('Set-Cookie', '')
html_tracks = self._html_search_regex(
r'(?ms)<script type="application/json" id="displayList-data">\s*(.*?)\s*</script>',
r'(?ms)<script type="application/json" id="displayList-data">(.+?)</script>',
response, 'tracks')
try:
track_list = json.loads(html_tracks)
@@ -46,15 +43,14 @@ class HypemIE(InfoExtractor):
key = track['key']
track_id = track['id']
artist = track['artist']
title = track['song']
serve_url = "http://hypem.com/serve/source/%s/%s" % (track_id, key)
request = sanitized_Request(
serve_url, '', {'Content-Type': 'application/json'})
request.add_header('cookie', cookie)
'http://hypem.com/serve/source/%s/%s' % (track_id, key),
'', {'Content-Type': 'application/json'})
song_data = self._download_json(request, track_id, 'Downloading metadata')
final_url = song_data["url"]
final_url = song_data['url']
artist = track.get('artist')
return {
'id': track_id,

@@ -3,10 +3,7 @@ from __future__ import unicode_literals
import base64
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse_unquote,
compat_urlparse,
)
from ..compat import compat_urllib_parse_unquote
class InfoQIE(InfoExtractor):
@@ -45,9 +42,11 @@ class InfoQIE(InfoExtractor):
video_filename = playpath.split('/')[-1]
video_id, extension = video_filename.split('.')
http_base = self._search_regex(
r'EXPRESSINSTALL_SWF\s*=\s*[^"]*"((?:https?:)?//[^/"]+/)', webpage,
'HTTP base URL')
http_video_url = self._search_regex(r'P\.s\s*=\s*\'([^\']+)\'', webpage, 'video URL')
policy = self._search_regex(r'InfoQConstants.scp\s*=\s*\'([^\']+)\'', webpage, 'policy')
signature = self._search_regex(r'InfoQConstants.scs\s*=\s*\'([^\']+)\'', webpage, 'signature')
key_pair_id = self._search_regex(r'InfoQConstants.sck\s*=\s*\'([^\']+)\'', webpage, 'key-pair-id')
formats = [{
'format_id': 'rtmp',
@@ -56,7 +55,11 @@ class InfoQIE(InfoExtractor):
'play_path': playpath,
}, {
'format_id': 'http',
'url': compat_urlparse.urljoin(url, http_base) + real_id,
'url': http_video_url,
'http_headers': {
'Cookie': 'CloudFront-Policy=%s; CloudFront-Signature=%s; CloudFront-Key-Pair-Id=%s' % (
policy, signature, key_pair_id),
},
}]
self._sort_formats(formats)
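A minimal standalone sketch of how the HTTP format's cookie header in the hunk above could be assembled. The three `InfoQConstants` names and the cookie layout come from the diff; the helper name `build_cloudfront_cookie` and the sample page text are hypothetical, and the dot in the pattern is escaped here, which the diff does not do.

```python
import re


def build_cloudfront_cookie(webpage):
    """Join the three signed-cookie values found in the page into one Cookie header."""
    def grab(name):
        # Same pattern shape as in the diff, with the dot escaped.
        mobj = re.search(r"InfoQConstants\.%s\s*=\s*'([^']+)'" % name, webpage)
        if mobj is None:
            raise ValueError('missing InfoQConstants.%s' % name)
        return mobj.group(1)

    return 'CloudFront-Policy=%s; CloudFront-Signature=%s; CloudFront-Key-Pair-Id=%s' % (
        grab('scp'), grab('scs'), grab('sck'))


# Hypothetical page fragment, for illustration only.
sample = "InfoQConstants.scp = 'POLICY'; InfoQConstants.scs = 'SIG'; InfoQConstants.sck = 'KEY';"
cookie = build_cloudfront_cookie(sample)
```

The resulting string is what the diff places under `http_headers` for the `http` format.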

@@ -205,9 +205,8 @@ class IqiyiIE(InfoExtractor):
def get_enc_key(self, swf_url, video_id):
# TODO: automatic key extraction
# last update at 2015-10-22 for Zombie::bite
# '7223c67061dbea1259d0ceb44f44b6d62288f4f80c972170de5201d2321060270e05'[2:66][0::2]
enc_key = '2c76de15dcb44bd28ff0927d50d31620'
# last update at 2015-12-06 for Zombie::bite
enc_key = '3719f6a1da83ee0aee3488d8802d7696'[::-1]
return enc_key
def _real_extract(self, url):

@@ -1,23 +1,25 @@
from __future__ import unicode_literals
import os
import re
from .common import InfoExtractor
from ..compat import compat_urllib_parse_urlparse
from ..utils import sanitized_Request
from ..utils import (
sanitized_Request,
url_basename,
)
class KeezMoviesIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?keezmovies\.com/video/.+?(?P<id>[0-9]+)(?:[/?&]|$)'
_TEST = {
'url': 'http://www.keezmovies.com/video/petite-asian-lady-mai-playing-in-bathtub-1214711',
'md5': '6e297b7e789329923fcf83abb67c9289',
'md5': '1c1e75d22ffa53320f45eeb07bc4cdc0',
'info_dict': {
'id': '1214711',
'ext': 'mp4',
'title': 'Petite Asian Lady Mai Playing In Bathtub',
'age_limit': 18,
'thumbnail': 're:^https?://.*\.jpg$',
}
}
@@ -36,21 +38,29 @@ class KeezMoviesIE(InfoExtractor):
video_title = self._html_search_regex(
r'<h1 [^>]*>([^<]+)', webpage, 'title')
video_url = self._html_search_regex(
r'(?s)html5VideoPlayer = .*?src="([^"]+)"', webpage, 'video URL')
path = compat_urllib_parse_urlparse(video_url).path
extension = os.path.splitext(path)[1][1:]
format = path.split('/')[4].split('_')[:2]
format = "-".join(format)
flashvars = self._parse_json(self._search_regex(
r'var\s+flashvars\s*=\s*([^;]+);', webpage, 'flashvars'), video_id)
formats = []
for height in (180, 240, 480):
if flashvars.get('quality_%dp' % height):
video_url = flashvars['quality_%dp' % height]
a_format = {
'url': video_url,
'height': height,
'format_id': '%dp' % height,
}
filename_parts = url_basename(video_url).split('_')
if len(filename_parts) >= 2 and re.match(r'\d+[Kk]', filename_parts[1]):
a_format['tbr'] = int(filename_parts[1][:-1])
formats.append(a_format)
age_limit = self._rta_search(webpage)
return {
'id': video_id,
'title': video_title,
'url': video_url,
'ext': extension,
'format': format,
'format_id': format,
'formats': formats,
'age_limit': age_limit,
'thumbnail': flashvars.get('image_url')
}
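The bitrate lookup in the new format loop can be sketched on its own, assuming file names shaped like `240P_400K_<id>.mp4` (that layout is an assumption based on the `split('_')` logic, not something the diff states). `tbr_from_url` is a hypothetical helper; it uses `urllib.parse` in place of youtube-dl's `url_basename`, and it anchors the pattern with `$`, which is slightly stricter than the `re.match` in the diff.

```python
import re
from urllib.parse import urlparse


def tbr_from_url(video_url):
    """Return the bitrate encoded in the second '_'-separated part of the file name, or None."""
    # Stand-in for url_basename: last component of the URL path.
    basename = urlparse(video_url).path.rstrip('/').split('/')[-1]
    parts = basename.split('_')
    if len(parts) >= 2:
        mobj = re.match(r'(\d+)[Kk]$', parts[1])
        if mobj:
            return int(mobj.group(1))
    return None
```

For example, `tbr_from_url('http://cdn.example.com/videos/240P_400K_1214711.mp4')` yields `400`, while a name without a bitrate part yields `None`.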

@@ -154,10 +154,10 @@ class MetacafeIE(InfoExtractor):
# Extract URL, uploader and title from webpage
self.report_extraction(video_id)
video_url = None
mobj = re.search(r'(?m)&mediaURL=([^&]+)', webpage)
mobj = re.search(r'(?m)&(?:media|video)URL=([^&]+)', webpage)
if mobj is not None:
mediaURL = compat_urllib_parse_unquote(mobj.group(1))
video_ext = mediaURL[-3:]
video_ext = determine_ext(mediaURL)
# Extract gdaKey if available
mobj = re.search(r'(?m)&gdaKey=(.*?)&', webpage)
@@ -229,7 +229,7 @@ class MetacafeIE(InfoExtractor):
age_limit = (
18
if re.search(r'"contentRating":"restricted"', webpage)
if re.search(r'(?:"contentRating":|"rating",)"restricted"', webpage)
else 0)
if isinstance(video_url, list):

@@ -64,7 +64,8 @@ class MixcloudIE(InfoExtractor):
preview_url = self._search_regex(
r'\s(?:data-preview-url|m-preview)="([^"]+)"', webpage, 'preview url')
song_url = preview_url.replace('/previews/', '/c/originals/')
song_url = re.sub(r'audiocdn(\d+)', r'stream\1', preview_url)
song_url = song_url.replace('/previews/', '/c/originals/')
if not self._check_url(song_url, track_id, 'mp3'):
song_url = song_url.replace('.mp3', '.m4a').replace('originals/', 'm4a/64/')
if not self._check_url(song_url, track_id, 'm4a'):

@@ -1,27 +0,0 @@
from __future__ import unicode_literals
from .novamov import NovaMovIE
class MovShareIE(NovaMovIE):
IE_NAME = 'movshare'
IE_DESC = 'MovShare'
_VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': 'movshare\.(?:net|sx|ag)'}
_HOST = 'www.movshare.net'
_FILE_DELETED_REGEX = r'>This file no longer exists on our servers.<'
_TITLE_REGEX = r'<strong>Title:</strong> ([^<]+)</p>'
_DESCRIPTION_REGEX = r'<strong>Description:</strong> ([^<]+)</p>'
_TEST = {
'url': 'http://www.movshare.net/video/559e28be54d96',
'md5': 'abd31a2132947262c50429e1d16c1bfd',
'info_dict': {
'id': '559e28be54d96',
'ext': 'flv',
'title': 'dissapeared image',
'description': 'optical illusion dissapeared image magic illusion',
}
}

@@ -1,63 +1,102 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
remove_end,
parse_duration,
int_or_none,
xpath_text,
xpath_attr,
)
class NBAIE(InfoExtractor):
_VALID_URL = r'https?://(?:watch\.|www\.)?nba\.com/(?:nba/)?video(?P<id>/[^?]*?)/?(?:/index\.html)?(?:\?.*)?$'
_VALID_URL = r'https?://(?:watch\.|www\.)?nba\.com/(?P<path>(?:[^/]+/)?video/(?P<id>[^?]*?))/?(?:/index\.html)?(?:\?.*)?$'
_TESTS = [{
'url': 'http://www.nba.com/video/games/nets/2012/12/04/0021200253-okc-bkn-recap.nba/index.html',
'md5': 'c0edcfc37607344e2ff8f13c378c88a4',
'md5': '9e7729d3010a9c71506fd1248f74e4f4',
'info_dict': {
'id': '0021200253-okc-bkn-recap.nba',
'ext': 'mp4',
'id': '0021200253-okc-bkn-recap',
'ext': 'flv',
'title': 'Thunder vs. Nets',
'description': 'Kevin Durant scores 32 points and dishes out six assists as the Thunder beat the Nets in Brooklyn.',
'duration': 181,
'timestamp': 1354638466,
'upload_date': '20121204',
},
}, {
'url': 'http://www.nba.com/video/games/hornets/2014/12/05/0021400276-nyk-cha-play5.nba/',
'only_matching': True,
}, {
'url': 'http://watch.nba.com/nba/video/channels/playoffs/2015/05/20/0041400301-cle-atl-recap.nba',
'url': 'http://watch.nba.com/video/channels/playoffs/2015/05/20/0041400301-cle-atl-recap.nba',
'md5': 'b2b39b81cf28615ae0c3360a3f9668c4',
'info_dict': {
'id': '0041400301-cle-atl-recap.nba',
'id': '0041400301-cle-atl-recap',
'ext': 'mp4',
'title': 'NBA GAME TIME | Video: Hawks vs. Cavaliers Game 1',
'title': 'Hawks vs. Cavaliers Game 1',
'description': 'md5:8094c3498d35a9bd6b1a8c396a071b4d',
'duration': 228,
},
'params': {
'skip_download': True,
'timestamp': 1432134543,
'upload_date': '20150520',
}
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
path, video_id = re.match(self._VALID_URL, url).groups()
if path.startswith('nba/'):
path = path[4:]
video_info = self._download_xml('http://www.nba.com/%s.xml' % path, video_id)
video_id = xpath_text(video_info, 'slug')
title = xpath_text(video_info, 'headline')
description = xpath_text(video_info, 'description')
duration = parse_duration(xpath_text(video_info, 'length'))
timestamp = int_or_none(xpath_attr(video_info, 'dateCreated', 'uts'))
video_url = 'http://ht-mobile.cdn.turner.com/nba/big' + video_id + '_nba_1280x720.mp4'
thumbnails = []
for image in video_info.find('images'):
thumbnails.append({
'id': image.attrib.get('cut'),
'url': image.text,
'width': int_or_none(image.attrib.get('width')),
'height': int_or_none(image.attrib.get('height')),
})
shortened_video_id = video_id.rpartition('/')[2]
title = remove_end(
self._og_search_title(webpage, default=shortened_video_id), ' : NBA.com')
description = self._og_search_description(webpage)
duration_str = self._html_search_meta(
'duration', webpage, 'duration', default=None)
if not duration_str:
duration_str = self._html_search_regex(
r'Duration:</b>\s*(\d+:\d+)', webpage, 'duration', fatal=False)
duration = parse_duration(duration_str)
formats = []
for video_file in video_info.findall('.//file'):
video_url = video_file.text
if video_url.startswith('/'):
continue
if video_url.endswith('.m3u8'):
m3u8_formats = self._extract_m3u8_formats(video_url, video_id, m3u8_id='hls', fatal=False)
if m3u8_formats:
formats.extend(m3u8_formats)
elif video_url.endswith('.f4m'):
f4m_formats = self._extract_f4m_formats(video_url + '?hdcore=3.4.1.1', video_id, f4m_id='hds', fatal=False)
if f4m_formats:
formats.extend(f4m_formats)
else:
key = video_file.attrib.get('bitrate')
format_info = {
'format_id': key,
'url': video_url,
}
mobj = re.search(r'(\d+)x(\d+)(?:_(\d+))?', key)
if mobj:
format_info.update({
'width': int(mobj.group(1)),
'height': int(mobj.group(2)),
'tbr': int_or_none(mobj.group(3)),
})
formats.append(format_info)
self._sort_formats(formats)
return {
'id': shortened_video_id,
'url': video_url,
'id': video_id,
'title': title,
'description': description,
'duration': duration,
'timestamp': timestamp,
'thumbnails': thumbnails,
'formats': formats,
}
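How the `bitrate` attribute is turned into width, height and bitrate can be sketched as below. `parse_bitrate_key` is a hypothetical helper and the sample keys are assumed shapes. Unlike the loop in the diff, it tolerates a missing attribute by searching an empty string instead of `None`.

```python
import re


def parse_bitrate_key(key):
    """Split a key such as '1280x720_3072' into width, height and total bitrate."""
    info = {'format_id': key}
    # `key or ''` guards against a <file> element without a bitrate attribute.
    mobj = re.search(r'(\d+)x(\d+)(?:_(\d+))?', key or '')
    if mobj:
        info.update({
            'width': int(mobj.group(1)),
            'height': int(mobj.group(2)),
            'tbr': int(mobj.group(3)) if mobj.group(3) else None,
        })
    return info
```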

@@ -11,6 +11,7 @@ from ..utils import (
ExtractorError,
find_xpath_attr,
lowercase_escape,
smuggle_url,
unescapeHTML,
)
@@ -62,12 +63,13 @@ class NBCIE(InfoExtractor):
theplatform_url = unescapeHTML(lowercase_escape(self._html_search_regex(
[
r'(?:class="video-player video-player-full" data-mpx-url|class="player" src)="(.*?)"',
r'<iframe[^>]+src="((?:https?:)?//player\.theplatform\.com/[^"]+)"',
r'"embedURL"\s*:\s*"([^"]+)"'
],
webpage, 'theplatform url').replace('_no_endcard', '').replace('\\/', '/')))
if theplatform_url.startswith('//'):
theplatform_url = 'http:' + theplatform_url
return self.url_result(theplatform_url)
return self.url_result(smuggle_url(theplatform_url, {'source_url': url}))
class NBCSportsVPlayerIE(InfoExtractor):

@@ -23,9 +23,10 @@ class NovaMovIE(InfoExtractor):
_HOST = 'www.novamov.com'
_FILE_DELETED_REGEX = r'This file no longer exists on our servers!</h2>'
_FILEKEY_REGEX = r'flashvars\.filekey="(?P<filekey>[^"]+)";'
_FILEKEY_REGEX = r'flashvars\.filekey=(?P<filekey>"?[^"]+"?);'
_TITLE_REGEX = r'(?s)<div class="v_tab blockborder rounded5" id="v_tab1">\s*<h3>([^<]+)</h3>'
_DESCRIPTION_REGEX = r'(?s)<div class="v_tab blockborder rounded5" id="v_tab1">\s*<h3>[^<]+</h3><p>([^<]+)</p>'
_URL_TEMPLATE = 'http://%s/video/%s'
_TEST = {
'url': 'http://www.novamov.com/video/4rurhn9x446jj',
@@ -39,20 +40,28 @@ class NovaMovIE(InfoExtractor):
'skip': '"Invalid token" errors abound (in web interface as well as youtube-dl, there is nothing we can do about it.)'
}
def _check_existence(self, webpage, video_id):
if re.search(self._FILE_DELETED_REGEX, webpage) is not None:
raise ExtractorError('Video %s does not exist' % video_id, expected=True)
def _real_extract(self, url):
video_id = self._match_id(url)
url = 'http://%s/video/%s' % (self._HOST, video_id)
url = self._URL_TEMPLATE % (self._HOST, video_id)
webpage = self._download_webpage(
url, video_id, 'Downloading video page')
if re.search(self._FILE_DELETED_REGEX, webpage) is not None:
raise ExtractorError('Video %s does not exist' % video_id, expected=True)
self._check_existence(webpage, video_id)
def extract_filekey(default=NO_DEFAULT):
return self._search_regex(
filekey = self._search_regex(
self._FILEKEY_REGEX, webpage, 'filekey', default=default)
if filekey is not default and (filekey[0] != '"' or filekey[-1] != '"'):
return self._search_regex(
r'var\s*%s\s*=\s*"([^"]+)"' % re.escape(filekey), webpage, 'filekey', default=default)
else:
return filekey
filekey = extract_filekey(default=None)
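The two-step filekey lookup can be illustrated outside the extractor. This is a sketch under assumptions: the sample page snippets are invented, the standalone `extract_filekey` below is hypothetical, and it strips the surrounding quotes from a literal key, which the code in the diff does not do.

```python
import re

# Pattern from the diff: the value may be a quoted literal or a bare variable name.
FILEKEY_REGEX = r'flashvars\.filekey=(?P<filekey>"?[^"]+"?);'


def extract_filekey(webpage):
    """Return the filekey, following one level of variable indirection if needed."""
    mobj = re.search(FILEKEY_REGEX, webpage)
    if mobj is None:
        return None
    filekey = mobj.group('filekey')
    if filekey.startswith('"') and filekey.endswith('"'):
        # Quoted literal: the key is given directly.
        return filekey[1:-1]
    # Bare identifier: look up the variable that holds the key.
    mobj = re.search(r'var\s+%s\s*=\s*"([^"]+)"' % re.escape(filekey), webpage)
    return mobj.group(1) if mobj else None
```

With `'flashvars.filekey="abc-123";'` the literal is returned; with `'var fkzd="83.abc-123";\nflashvars.filekey=fkzd;'` the value of `fkzd` is returned.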
@@ -69,6 +78,7 @@ class NovaMovIE(InfoExtractor):
request.add_header('Referer', post_url)
webpage = self._download_webpage(
request, video_id, 'Downloading continue to the video page')
self._check_existence(webpage, video_id)
filekey = extract_filekey()
@@ -92,3 +102,90 @@ class NovaMovIE(InfoExtractor):
'title': title,
'description': description
}
class WholeCloudIE(NovaMovIE):
IE_NAME = 'wholecloud'
IE_DESC = 'WholeCloud'
_VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': '(?:wholecloud\.net|movshare\.(?:net|sx|ag))'}
_HOST = 'www.wholecloud.net'
_FILE_DELETED_REGEX = r'>This file no longer exists on our servers.<'
_TITLE_REGEX = r'<strong>Title:</strong> ([^<]+)</p>'
_DESCRIPTION_REGEX = r'<strong>Description:</strong> ([^<]+)</p>'
_TEST = {
'url': 'http://www.wholecloud.net/video/559e28be54d96',
'md5': 'abd31a2132947262c50429e1d16c1bfd',
'info_dict': {
'id': '559e28be54d96',
'ext': 'flv',
'title': 'dissapeared image',
'description': 'optical illusion dissapeared image magic illusion',
}
}
class NowVideoIE(NovaMovIE):
IE_NAME = 'nowvideo'
IE_DESC = 'NowVideo'
_VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': 'nowvideo\.(?:to|ch|ec|sx|eu|at|ag|co|li)'}
_HOST = 'www.nowvideo.to'
_FILE_DELETED_REGEX = r'>This file no longer exists on our servers.<'
_TITLE_REGEX = r'<h4>([^<]+)</h4>'
_DESCRIPTION_REGEX = r'</h4>\s*<p>([^<]+)</p>'
_TEST = {
'url': 'http://www.nowvideo.to/video/0mw0yow7b6dxa',
'md5': 'f8fbbc8add72bd95b7850c6a02fc8817',
'info_dict': {
'id': '0mw0yow7b6dxa',
'ext': 'flv',
'title': 'youtubedl test video _BaW_jenozKc.mp4',
'description': 'Description',
},
'skip': 'Video 0mw0yow7b6dxa does not exist',
}
class VideoWeedIE(NovaMovIE):
IE_NAME = 'videoweed'
IE_DESC = 'VideoWeed'
_VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': 'videoweed\.(?:es|com)'}
_HOST = 'www.videoweed.es'
_FILE_DELETED_REGEX = r'>This file no longer exists on our servers.<'
_TITLE_REGEX = r'<h1 class="text_shadow">([^<]+)</h1>'
_URL_TEMPLATE = 'http://%s/file/%s'
_TEST = {
'url': 'http://www.videoweed.es/file/b42178afbea14',
'md5': 'abd31a2132947262c50429e1d16c1bfd',
'info_dict': {
'id': 'b42178afbea14',
'ext': 'flv',
'title': 'optical illusion dissapeared image magic illusion',
'description': ''
},
}
class CloudTimeIE(NovaMovIE):
IE_NAME = 'cloudtime'
IE_DESC = 'CloudTime'
_VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': 'cloudtime\.to'}
_HOST = 'www.cloudtime.to'
_FILE_DELETED_REGEX = r'>This file no longer exists on our servers.<'
_TITLE_REGEX = r'<div[^>]+class=["\']video_det["\'][^>]*>\s*<strong>([^<]+)</strong>'
_TEST = None

@@ -71,7 +71,7 @@ class NowTVBaseIE(InfoExtractor):
class NowTVIE(NowTVBaseIE):
_VALID_URL = r'https?://(?:www\.)?nowtv\.(?:de|at|ch)/(?:rtl|rtl2|rtlnitro|superrtl|ntv|vox)/(?P<show_id>[^/]+)/(?:list/[^/]+/)?(?P<id>[^/]+)/(?:player|preview)'
_VALID_URL = r'https?://(?:www\.)?nowtv\.(?:de|at|ch)/(?:rtl|rtl2|rtlnitro|superrtl|ntv|vox)/(?P<show_id>[^/]+)/(?:(?:list/[^/]+|jahr/\d{4}/\d{1,2})/)?(?P<id>[^/]+)/(?:player|preview)'
_TESTS = [{
# rtl
@@ -190,6 +190,9 @@ class NowTVIE(NowTVBaseIE):
}, {
'url': 'http://www.nowtv.de/rtl2/echtzeit/list/aktuell/schnelles-geld-am-ende-der-welt/player',
'only_matching': True,
}, {
'url': 'http://www.nowtv.de/rtl2/zuhause-im-glueck/jahr/2015/11/eine-erschuetternde-diagnose/player',
'only_matching': True,
}]
def _real_extract(self, url):
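A quick check of what the widened `_VALID_URL` changes, using only the two patterns and the two test URLs shown in this diff. The variable names are illustrative.

```python
import re

OLD_VALID_URL = r'https?://(?:www\.)?nowtv\.(?:de|at|ch)/(?:rtl|rtl2|rtlnitro|superrtl|ntv|vox)/(?P<show_id>[^/]+)/(?:list/[^/]+/)?(?P<id>[^/]+)/(?:player|preview)'
NEW_VALID_URL = r'https?://(?:www\.)?nowtv\.(?:de|at|ch)/(?:rtl|rtl2|rtlnitro|superrtl|ntv|vox)/(?P<show_id>[^/]+)/(?:(?:list/[^/]+|jahr/\d{4}/\d{1,2})/)?(?P<id>[^/]+)/(?:player|preview)'

list_url = 'http://www.nowtv.de/rtl2/echtzeit/list/aktuell/schnelles-geld-am-ende-der-welt/player'
year_url = 'http://www.nowtv.de/rtl2/zuhause-im-glueck/jahr/2015/11/eine-erschuetternde-diagnose/player'

# The old pattern rejects the jahr/<year>/<month> form; the new one accepts both forms.
old_year_match = re.match(OLD_VALID_URL, year_url)
new_list_groups = re.match(NEW_VALID_URL, list_url).groupdict()
new_year_groups = re.match(NEW_VALID_URL, year_url).groupdict()
```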

@@ -1,28 +0,0 @@
from __future__ import unicode_literals
from .novamov import NovaMovIE
class NowVideoIE(NovaMovIE):
IE_NAME = 'nowvideo'
IE_DESC = 'NowVideo'
_VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': 'nowvideo\.(?:to|ch|ec|sx|eu|at|ag|co|li)'}
_HOST = 'www.nowvideo.to'
_FILE_DELETED_REGEX = r'>This file no longer exists on our servers.<'
_FILEKEY_REGEX = r'var fkzd="([^"]+)";'
_TITLE_REGEX = r'<h4>([^<]+)</h4>'
_DESCRIPTION_REGEX = r'</h4>\s*<p>([^<]+)</p>'
_TEST = {
'url': 'http://www.nowvideo.ch/video/0mw0yow7b6dxa',
'md5': 'f8fbbc8add72bd95b7850c6a02fc8817',
'info_dict': {
'id': '0mw0yow7b6dxa',
'ext': 'flv',
'title': 'youtubedl test video _BaW_jenozKc.mp4',
'description': 'Description',
}
}

@@ -6,6 +6,7 @@ import re
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
determine_ext,
ExtractorError,
float_or_none,
parse_duration,
@@ -48,12 +49,22 @@ class NRKIE(InfoExtractor):
'http://v8.psapi.nrk.no/mediaelement/%s' % video_id,
video_id, 'Downloading media JSON')
if data['usageRights']['isGeoBlocked']:
raise ExtractorError(
'NRK har ikke rettigheter til å vise dette programmet utenfor Norge',
expected=True)
media_url = data.get('mediaUrl')
video_url = data['mediaUrl'] + '?hdcore=3.5.0&plugin=aasp-3.5.0.151.81'
if not media_url:
if data['usageRights']['isGeoBlocked']:
raise ExtractorError(
'NRK har ikke rettigheter til å vise dette programmet utenfor Norge',
expected=True)
if determine_ext(media_url) == 'f4m':
formats = self._extract_f4m_formats(
media_url + '?hdcore=3.5.0&plugin=aasp-3.5.0.151.81', video_id, f4m_id='hds')
else:
formats = [{
'url': media_url,
'ext': 'flv',
}]
duration = parse_duration(data.get('duration'))
@@ -67,12 +78,11 @@ class NRKIE(InfoExtractor):
return {
'id': video_id,
'url': video_url,
'ext': 'flv',
'title': data['title'],
'description': data['description'],
'duration': duration,
'thumbnail': thumbnail,
'formats': formats,
}

@@ -1,108 +1,78 @@
from __future__ import unicode_literals
import re
import json
import base64
from .common import InfoExtractor
from ..utils import (
unescapeHTML,
ExtractorError,
determine_ext,
int_or_none,
float_or_none,
ExtractorError,
unsmuggle_url,
)
from ..compat import compat_urllib_parse
class OoyalaBaseIE(InfoExtractor):
def _extract_result(self, info, more_info):
embedCode = info['embedCode']
video_url = info.get('ipad_url') or info['url']
if determine_ext(video_url) == 'm3u8':
formats = self._extract_m3u8_formats(video_url, embedCode, ext='mp4')
else:
formats = [{
'url': video_url,
'ext': 'mp4',
}]
return {
'id': embedCode,
'title': unescapeHTML(info['title']),
'formats': formats,
'description': unescapeHTML(more_info['description']),
'thumbnail': more_info['promo'],
def _extract(self, content_tree_url, video_id, domain='example.org'):
content_tree = self._download_json(content_tree_url, video_id)['content_tree']
metadata = content_tree[list(content_tree)[0]]
embed_code = metadata['embed_code']
pcode = metadata.get('asset_pcode') or embed_code
video_info = {
'id': embed_code,
'title': metadata['title'],
'description': metadata.get('description'),
'thumbnail': metadata.get('thumbnail_image') or metadata.get('promo_image'),
'duration': float_or_none(metadata.get('duration'), 1000),
}
def _extract(self, player_url, video_id):
player = self._download_webpage(player_url, video_id)
mobile_url = self._search_regex(r'mobile_player_url="(.+?)&device="',
player, 'mobile player url')
# Looks like some videos are only available for particular devices
# (e.g. http://player.ooyala.com/player.js?embedCode=x1b3lqZDq9y_7kMyC2Op5qo-p077tXD0
# is only available for ipad)
# Working around with fetching URLs for all the devices found starting with 'unknown'
# until we succeed or eventually fail for each device.
devices = re.findall(r'device\s*=\s*"([^"]+)";', player)
devices.remove('unknown')
devices.insert(0, 'unknown')
for device in devices:
mobile_player = self._download_webpage(
'%s&device=%s' % (mobile_url, device), video_id,
'Downloading mobile player JS for %s device' % device)
videos_info = self._search_regex(
r'var streams=window.oo_testEnv\?\[\]:eval\("\((\[{.*?}\])\)"\);',
mobile_player, 'info', fatal=False, default=None)
if videos_info:
break
if not videos_info:
formats = []
urls = []
formats = []
for supported_format in ('mp4', 'm3u8', 'hds', 'rtmp'):
auth_data = self._download_json(
'http://player.ooyala.com/sas/player_api/v1/authorization/embed_code/%s/%s?domain=www.example.org&supportedFormats=mp4,webm' % (video_id, video_id),
video_id)
'http://player.ooyala.com/sas/player_api/v1/authorization/embed_code/%s/%s?' % (pcode, embed_code) + compat_urllib_parse.urlencode({'domain': domain, 'supportedFormats': supported_format}),
video_id, 'Downloading %s JSON' % supported_format)
cur_auth_data = auth_data['authorization_data'][video_id]
cur_auth_data = auth_data['authorization_data'][embed_code]
for stream in cur_auth_data['streams']:
formats.append({
'url': base64.b64decode(stream['url']['data'].encode('ascii')).decode('utf-8'),
'ext': stream.get('delivery_type'),
'format': stream.get('video_codec'),
'format_id': stream.get('profile'),
'width': int_or_none(stream.get('width')),
'height': int_or_none(stream.get('height')),
'abr': int_or_none(stream.get('audio_bitrate')),
'vbr': int_or_none(stream.get('video_bitrate')),
})
if formats:
return {
'id': video_id,
'formats': formats,
'title': 'Ooyala video',
}
if cur_auth_data['authorized']:
for stream in cur_auth_data['streams']:
url = base64.b64decode(stream['url']['data'].encode('ascii')).decode('utf-8')
if url in urls:
continue
urls.append(url)
delivery_type = stream['delivery_type']
if delivery_type == 'hls' or '.m3u8' in url:
m3u8_formats = self._extract_m3u8_formats(url, embed_code, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False)
if m3u8_formats:
formats.extend(m3u8_formats)
elif delivery_type == 'hds' or '.f4m' in url:
f4m_formats = self._extract_f4m_formats(url, embed_code, f4m_id='hds', fatal=False)
if f4m_formats:
formats.extend(f4m_formats)
elif '.smil' in url:
smil_formats = self._extract_smil_formats(url, embed_code, fatal=False)
if smil_formats:
formats.extend(smil_formats)
else:
formats.append({
'url': url,
'ext': stream.get('delivery_type'),
'vcodec': stream.get('video_codec'),
'format_id': delivery_type,
'width': int_or_none(stream.get('width')),
'height': int_or_none(stream.get('height')),
'abr': int_or_none(stream.get('audio_bitrate')),
'vbr': int_or_none(stream.get('video_bitrate')),
'fps': float_or_none(stream.get('framerate')),
})
else:
raise ExtractorError('%s said: %s' % (self.IE_NAME, cur_auth_data['message']), expected=True)
self._sort_formats(formats)
if not cur_auth_data['authorized']:
raise ExtractorError(cur_auth_data['message'], expected=True)
if not videos_info:
raise ExtractorError('Unable to extract info')
videos_info = videos_info.replace('\\"', '"')
videos_more_info = self._search_regex(
r'eval\("\(({.*?\\"promo\\".*?})\)"', mobile_player, 'more info').replace('\\"', '"')
videos_info = json.loads(videos_info)
videos_more_info = json.loads(videos_more_info)
if videos_more_info.get('lineup'):
videos = [self._extract_result(info, more_info) for (info, more_info) in zip(videos_info, videos_more_info['lineup'])]
return {
'_type': 'playlist',
'id': video_id,
'title': unescapeHTML(videos_more_info['title']),
'entries': videos,
}
else:
return self._extract_result(videos_info[0], videos_more_info)
video_info['formats'] = formats
return video_info
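The stream handling in the new `_extract` decodes each base64-encoded URL and skips duplicates that show up across the per-format authorization requests. A small sketch of just that step follows; the `streams` structure mirrors the `stream['url']['data']` access in the diff, while the helper names and sample URLs are hypothetical.

```python
import base64


def decode_stream_urls(streams):
    """Decode base64 stream URLs, keeping the first occurrence of each."""
    urls = []
    for stream in streams:
        url = base64.b64decode(stream['url']['data'].encode('ascii')).decode('utf-8')
        if url in urls:
            continue  # same URL already returned by an earlier request
        urls.append(url)
    return urls


def _b64(text):
    # Helper for building sample data only.
    return base64.b64encode(text.encode('utf-8')).decode('ascii')


sample_streams = [
    {'url': {'data': _b64('http://example.com/a.mp4')}},
    {'url': {'data': _b64('http://example.com/b.m3u8')}},
    {'url': {'data': _b64('http://example.com/a.mp4')}},
]
decoded = decode_stream_urls(sample_streams)
```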
class OoyalaIE(OoyalaBaseIE):
@@ -117,6 +87,7 @@ class OoyalaIE(OoyalaBaseIE):
'ext': 'mp4',
'title': 'Explaining Data Recovery from Hard Drives and SSDs',
'description': 'How badly damaged does a drive have to be to defeat Russell and his crew? Apparently, smashed to bits.',
'duration': 853.386,
},
}, {
# Only available for ipad
@@ -125,7 +96,7 @@ class OoyalaIE(OoyalaBaseIE):
'id': 'x1b3lqZDq9y_7kMyC2Op5qo-p077tXD0',
'ext': 'mp4',
'title': 'Simulation Overview - Levels of Simulation',
'description': '',
'duration': 194.948,
},
},
{
@@ -136,7 +107,8 @@ class OoyalaIE(OoyalaBaseIE):
'info_dict': {
'id': 'FiOG81ZTrvckcchQxmalf4aQj590qTEx',
'ext': 'mp4',
'title': 'Ooyala video',
'title': 'Divide Tool Path.mp4',
'duration': 204.405,
}
}
]
@@ -151,9 +123,11 @@ class OoyalaIE(OoyalaBaseIE):
ie=cls.ie_key())
def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {})
embed_code = self._match_id(url)
player_url = 'http://player.ooyala.com/player.js?embedCode=%s' % embed_code
return self._extract(player_url, embed_code)
domain = smuggled_data.get('domain')
content_tree_url = 'http://player.ooyala.com/player_api/v1/content_tree/embed_code/%s/%s' % (embed_code, embed_code)
return self._extract(content_tree_url, embed_code, domain)
class OoyalaExternalIE(OoyalaBaseIE):
@@ -170,7 +144,7 @@ class OoyalaExternalIE(OoyalaBaseIE):
.*?&pcode=
)
(?P<pcode>.+?)
(&|$)
(?:&|$)
'''
_TEST = {
@@ -179,7 +153,7 @@ class OoyalaExternalIE(OoyalaBaseIE):
'id': 'FkYWtmazr6Ed8xmvILvKLWjd4QvYZpzG',
'ext': 'mp4',
'title': 'dm_140128_30for30Shorts___JudgingJewellv2',
'description': '',
'duration': 1302000,
},
'params': {
# m3u8 download
@@ -188,9 +162,6 @@ class OoyalaExternalIE(OoyalaBaseIE):
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
partner_id = mobj.group('partner_id')
video_id = mobj.group('id')
pcode = mobj.group('pcode')
player_url = 'http://player.ooyala.com/player.js?externalId=%s:%s&pcode=%s' % (partner_id, video_id, pcode)
return self._extract(player_url, video_id)
partner_id, video_id, pcode = re.match(self._VALID_URL, url).groups()
content_tree_url = 'http://player.ooyala.com/player_api/v1/content_tree/external_id/%s/%s:%s' % (pcode, partner_id, video_id)
return self._extract(content_tree_url, video_id)

@@ -15,16 +15,181 @@ from ..utils import (
class PBSIE(InfoExtractor):
_STATIONS = (
('video.pbs.org', 'PBS: Public Broadcasting Service'), # http://www.pbs.org/
('video.aptv.org', 'APT - Alabama Public Television (WBIQ)'), # http://aptv.org/
('video.gpb.org', 'GPB/Georgia Public Broadcasting (WGTV)'), # http://www.gpb.org/
('video.mpbonline.org', 'Mississippi Public Broadcasting (WMPN)'), # http://www.mpbonline.org
('video.wnpt.org', 'Nashville Public Television (WNPT)'), # http://www.wnpt.org
('video.wfsu.org', 'WFSU-TV (WFSU)'), # http://wfsu.org/
('video.wsre.org', 'WSRE (WSRE)'), # http://www.wsre.org
('video.wtcitv.org', 'WTCI (WTCI)'), # http://www.wtcitv.org
('video.pba.org', 'WPBA/Channel 30 (WPBA)'), # http://pba.org/
('video.alaskapublic.org', 'Alaska Public Media (KAKM)'), # http://alaskapublic.org/kakm
# ('kuac.org', 'KUAC (KUAC)'), # http://kuac.org/kuac-tv/
# ('ktoo.org', '360 North (KTOO)'), # http://www.ktoo.org/
# ('azpm.org', 'KUAT 6 (KUAT)'), # http://www.azpm.org/
('video.azpbs.org', 'Arizona PBS (KAET)'), # http://www.azpbs.org
('portal.knme.org', 'KNME-TV/Channel 5 (KNME)'), # http://www.newmexicopbs.org/
('video.vegaspbs.org', 'Vegas PBS (KLVX)'), # http://vegaspbs.org/
('watch.aetn.org', 'AETN/ARKANSAS ETV NETWORK (KETS)'), # http://www.aetn.org/
('video.ket.org', 'KET (WKLE)'), # http://www.ket.org/
('video.wkno.org', 'WKNO/Channel 10 (WKNO)'), # http://www.wkno.org/
('video.lpb.org', 'LPB/LOUISIANA PUBLIC BROADCASTING (WLPB)'), # http://www.lpb.org/
('videos.oeta.tv', 'OETA (KETA)'), # http://www.oeta.tv
('video.optv.org', 'Ozarks Public Television (KOZK)'), # http://www.optv.org/
('watch.wsiu.org', 'WSIU Public Broadcasting (WSIU)'), # http://www.wsiu.org/
('video.keet.org', 'KEET TV (KEET)'), # http://www.keet.org
('pbs.kixe.org', 'KIXE/Channel 9 (KIXE)'), # http://kixe.org/
('video.kpbs.org', 'KPBS San Diego (KPBS)'), # http://www.kpbs.org/
('video.kqed.org', 'KQED (KQED)'), # http://www.kqed.org
('vids.kvie.org', 'KVIE Public Television (KVIE)'), # http://www.kvie.org
('video.pbssocal.org', 'PBS SoCal/KOCE (KOCE)'), # http://www.pbssocal.org/
('video.valleypbs.org', 'ValleyPBS (KVPT)'), # http://www.valleypbs.org/
('video.cptv.org', 'CONNECTICUT PUBLIC TELEVISION (WEDH)'), # http://cptv.org
('watch.knpb.org', 'KNPB Channel 5 (KNPB)'), # http://www.knpb.org/
('video.soptv.org', 'SOPTV (KSYS)'), # http://www.soptv.org
# ('klcs.org', 'KLCS/Channel 58 (KLCS)'), # http://www.klcs.org
# ('krcb.org', 'KRCB Television & Radio (KRCB)'), # http://www.krcb.org
# ('kvcr.org', 'KVCR TV/DT/FM :: Vision for the Future (KVCR)'), # http://kvcr.org
('video.rmpbs.org', 'Rocky Mountain PBS (KRMA)'), # http://www.rmpbs.org
('video.kenw.org', 'KENW-TV3 (KENW)'), # http://www.kenw.org
('video.kued.org', 'KUED Channel 7 (KUED)'), # http://www.kued.org
('video.wyomingpbs.org', 'Wyoming PBS (KCWC)'), # http://www.wyomingpbs.org
('video.cpt12.org', 'Colorado Public Television / KBDI 12 (KBDI)'), # http://www.cpt12.org/
('video.kbyueleven.org', 'KBYU-TV (KBYU)'), # http://www.kbyutv.org/
('video.thirteen.org', 'Thirteen/WNET New York (WNET)'), # http://www.thirteen.org
('video.wgbh.org', 'WGBH/Channel 2 (WGBH)'), # http://wgbh.org
('video.wgby.org', 'WGBY (WGBY)'), # http://www.wgby.org
('watch.njtvonline.org', 'NJTV Public Media NJ (WNJT)'), # http://www.njtvonline.org/
# ('ripbs.org', 'Rhode Island PBS (WSBE)'), # http://www.ripbs.org/home/
('watch.wliw.org', 'WLIW21 (WLIW)'), # http://www.wliw.org/
('video.mpt.tv', 'mpt/Maryland Public Television (WMPB)'), # http://www.mpt.org
('watch.weta.org', 'WETA Television and Radio (WETA)'), # http://www.weta.org
('video.whyy.org', 'WHYY (WHYY)'), # http://www.whyy.org
('video.wlvt.org', 'PBS 39 (WLVT)'), # http://www.wlvt.org/
('video.wvpt.net', 'WVPT - Your Source for PBS and More! (WVPT)'), # http://www.wvpt.net
('video.whut.org', 'Howard University Television (WHUT)'), # http://www.whut.org
('video.wedu.org', 'WEDU PBS (WEDU)'), # http://www.wedu.org
('video.wgcu.org', 'WGCU Public Media (WGCU)'), # http://www.wgcu.org/
# ('wjct.org', 'WJCT Public Broadcasting (WJCT)'), # http://www.wjct.org
('video.wpbt2.org', 'WPBT2 (WPBT)'), # http://www.wpbt2.org
('video.wucftv.org', 'WUCF TV (WUCF)'), # http://wucftv.org
('video.wuft.org', 'WUFT/Channel 5 (WUFT)'), # http://www.wuft.org
('watch.wxel.org', 'WXEL/Channel 42 (WXEL)'), # http://www.wxel.org/home/
('video.wlrn.org', 'WLRN/Channel 17 (WLRN)'), # http://www.wlrn.org/
('video.wusf.usf.edu', 'WUSF Public Broadcasting (WUSF)'), # http://wusf.org/
('video.scetv.org', 'ETV (WRLK)'), # http://www.scetv.org
('video.unctv.org', 'UNC-TV (WUNC)'), # http://www.unctv.org/
# ('pbsguam.org', 'PBS Guam (KGTF)'), # http://www.pbsguam.org/
('video.pbshawaii.org', 'PBS Hawaii - Oceanic Cable Channel 10 (KHET)'), # http://www.pbshawaii.org/
('video.idahoptv.org', 'Idaho Public Television (KAID)'), # http://idahoptv.org
('video.ksps.org', 'KSPS (KSPS)'), # http://www.ksps.org/home/
('watch.opb.org', 'OPB (KOPB)'), # http://www.opb.org
('watch.nwptv.org', 'KWSU/Channel 10 & KTNW/Channel 31 (KWSU)'), # http://www.kwsu.org
('video.will.illinois.edu', 'WILL-TV (WILL)'), # http://will.illinois.edu/
('video.networkknowledge.tv', 'Network Knowledge - WSEC/Springfield (WSEC)'), # http://www.wsec.tv
('video.wttw.com', 'WTTW11 (WTTW)'), # http://www.wttw.com/
# ('wtvp.org', 'WTVP & WTVP.org, Public Media for Central Illinois (WTVP)'), # http://www.wtvp.org/
('video.iptv.org', 'Iowa Public Television/IPTV (KDIN)'), # http://www.iptv.org/
('video.ninenet.org', 'Nine Network (KETC)'), # http://www.ninenet.org
('video.wfwa.org', 'PBS39 Fort Wayne (WFWA)'), # http://wfwa.org/
('video.wfyi.org', 'WFYI Indianapolis (WFYI)'), # http://www.wfyi.org
('video.mptv.org', 'Milwaukee Public Television (WMVS)'), # http://www.mptv.org
('video.wnin.org', 'WNIN (WNIN)'), # http://www.wnin.org/
('video.wnit.org', 'WNIT Public Television (WNIT)'), # http://www.wnit.org/
('video.wpt.org', 'WPT (WPNE)'), # http://www.wpt.org/
('video.wvut.org', 'WVUT/Channel 22 (WVUT)'), # http://wvut.org/
('video.weiu.net', 'WEIU/Channel 51 (WEIU)'), # http://www.weiu.net
('video.wqpt.org', 'WQPT-TV (WQPT)'), # http://www.wqpt.org
('video.wycc.org', 'WYCC PBS Chicago (WYCC)'), # http://www.wycc.org
# ('lakeshorepublicmedia.org', 'Lakeshore Public Television (WYIN)'), # http://lakeshorepublicmedia.org/
('video.wipb.org', 'WIPB-TV (WIPB)'), # http://wipb.org
('video.indianapublicmedia.org', 'WTIU (WTIU)'), # http://indianapublicmedia.org/tv/
('watch.cetconnect.org', 'CET (WCET)'), # http://www.cetconnect.org
('video.thinktv.org', 'ThinkTVNetwork (WPTD)'), # http://www.thinktv.org
('video.wbgu.org', 'WBGU-TV (WBGU)'), # http://wbgu.org
('video.wgvu.org', 'WGVU TV (WGVU)'), # http://www.wgvu.org/
('video.netnebraska.org', 'NET1 (KUON)'), # http://netnebraska.org
('video.pioneer.org', 'Pioneer Public Television (KWCM)'), # http://www.pioneer.org
('watch.sdpb.org', 'SDPB Television (KUSD)'), # http://www.sdpb.org
('video.tpt.org', 'TPT (KTCA)'), # http://www.tpt.org
('watch.ksmq.org', 'KSMQ (KSMQ)'), # http://www.ksmq.org/
('watch.kpts.org', 'KPTS/Channel 8 (KPTS)'), # http://www.kpts.org/
('watch.ktwu.org', 'KTWU/Channel 11 (KTWU)'), # http://ktwu.org
# ('shptv.org', 'Smoky Hills Public Television (KOOD)'), # http://www.shptv.org
# ('kcpt.org', 'KCPT Kansas City Public Television (KCPT)'), # http://kcpt.org/
# ('blueridgepbs.org', 'Blue Ridge PBS (WBRA)'), # http://www.blueridgepbs.org/
('watch.easttennesseepbs.org', 'East Tennessee PBS (WSJK)'), # http://easttennesseepbs.org
('video.wcte.tv', 'WCTE-TV (WCTE)'), # http://www.wcte.org
('video.wljt.org', 'WLJT, Channel 11 (WLJT)'), # http://wljt.org/
('video.wosu.org', 'WOSU TV (WOSU)'), # http://wosu.org/
('video.woub.org', 'WOUB/WOUC (WOUB)'), # http://woub.org/tv/index.php?section=5
('video.wvpublic.org', 'WVPB (WVPB)'), # http://wvpublic.org/
('video.wkyupbs.org', 'WKYU-PBS (WKYU)'), # http://www.wkyupbs.org
# ('wyes.org', 'WYES-TV/New Orleans (WYES)'), # http://www.wyes.org
('video.kera.org', 'KERA 13 (KERA)'), # http://www.kera.org/
('video.mpbn.net', 'MPBN (WCBB)'), # http://www.mpbn.net/
('video.mountainlake.org', 'Mountain Lake PBS (WCFE)'), # http://www.mountainlake.org/
('video.nhptv.org', 'NHPTV (WENH)'), # http://nhptv.org/
('video.vpt.org', 'Vermont PBS (WETK)'), # http://www.vpt.org
('video.witf.org', 'witf (WITF)'), # http://www.witf.org
('watch.wqed.org', 'WQED Multimedia (WQED)'), # http://www.wqed.org/
('video.wmht.org', 'WMHT Educational Telecommunications (WMHT)'), # http://www.wmht.org/home/
('video.deltabroadcasting.org', 'Q-TV (WDCQ)'), # http://www.deltabroadcasting.org
('video.dptv.org', 'WTVS Detroit Public TV (WTVS)'), # http://www.dptv.org/
('video.wcmu.org', 'CMU Public Television (WCMU)'), # http://www.wcmu.org
('video.wkar.org', 'WKAR-TV (WKAR)'), # http://wkar.org/
('wnmuvideo.nmu.edu', 'WNMU-TV Public TV 13 (WNMU)'), # http://wnmutv.nmu.edu
('video.wdse.org', 'WDSE - WRPT (WDSE)'), # http://www.wdse.org/
('video.wgte.org', 'WGTE TV (WGTE)'), # http://www.wgte.org
('video.lptv.org', 'Lakeland Public Television (KAWE)'), # http://www.lakelandptv.org
# ('prairiepublic.org', 'PRAIRIE PUBLIC (KFME)'), # http://www.prairiepublic.org/
('video.kmos.org', 'KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS)'), # http://www.kmos.org/
('watch.montanapbs.org', 'MontanaPBS (KUSM)'), # http://montanapbs.org
('video.krwg.org', 'KRWG/Channel 22 (KRWG)'), # http://www.krwg.org
('video.kacvtv.org', 'KACV (KACV)'), # http://www.panhandlepbs.org/home/
('video.kcostv.org', 'KCOS/Channel 13 (KCOS)'), # http://www.kcostv.org
('video.wcny.org', 'WCNY/Channel 24 (WCNY)'), # http://www.wcny.org
('video.wned.org', 'WNED (WNED)'), # http://www.wned.org/
('watch.wpbstv.org', 'WPBS (WPBS)'), # http://www.wpbstv.org
('video.wskg.org', 'WSKG Public TV (WSKG)'), # http://wskg.org
('video.wxxi.org', 'WXXI (WXXI)'), # http://wxxi.org
('video.wpsu.org', 'WPSU (WPSU)'), # http://www.wpsu.org
# ('wqln.org', 'WQLN/Channel 54 (WQLN)'), # http://www.wqln.org
('on-demand.wvia.org', 'WVIA Public Media Studios (WVIA)'), # http://www.wvia.org/
('video.wtvi.org', 'WTVI (WTVI)'), # http://www.wtvi.org/
# ('whro.org', 'WHRO (WHRO)'), # http://whro.org
('video.westernreservepublicmedia.org', 'Western Reserve PBS (WNEO)'), # http://www.WesternReservePublicMedia.org/
('video.ideastream.org', 'WVIZ/PBS ideastream (WVIZ)'), # http://www.wviz.org/
('video.kcts9.org', 'KCTS 9 (KCTS)'), # http://kcts9.org/
('video.basinpbs.org', 'Basin PBS (KPBT)'), # http://www.basinpbs.org
('video.houstonpbs.org', 'KUHT / Channel 8 (KUHT)'), # http://www.houstonpublicmedia.org/
# ('tamu.edu', 'KAMU - TV (KAMU)'), # http://KAMU.tamu.edu
# ('kedt.org', 'KEDT/Channel 16 (KEDT)'), # http://www.kedt.org
('video.klrn.org', 'KLRN (KLRN)'), # http://www.klrn.org
('video.klru.tv', 'KLRU (KLRU)'), # http://www.klru.org
# ('kmbh.org', 'KMBH-TV (KMBH)'), # http://www.kmbh.org
# ('knct.org', 'KNCT (KNCT)'), # http://www.knct.org
# ('ktxt.org', 'KTTZ-TV (KTXT)'), # http://www.ktxt.org
('video.wtjx.org', 'WTJX Channel 12 (WTJX)'), # http://www.wtjx.org/
('video.ideastations.org', 'WCVE PBS (WCVE)'), # http://ideastations.org/
('video.kbtc.org', 'KBTC Public Television (KBTC)'), # http://kbtc.org
)
IE_NAME = 'pbs'
IE_DESC = 'Public Broadcasting Service (PBS) and member stations: %s' % ', '.join(list(zip(*_STATIONS))[1])
_VALID_URL = r'''(?x)https?://
(?:
# Direct video URL
video\.pbs\.org/(?:viralplayer|video)/(?P<id>[0-9]+)/? |
(?:%s)/(?:viralplayer|video)/(?P<id>[0-9]+)/? |
# Article with embedded player (or direct video)
(?:www\.)?pbs\.org/(?:[^/]+/){2,5}(?P<presumptive_id>[^/]+?)(?:\.html)?/?(?:$|[?\#]) |
# Player
(?:video|player)\.pbs\.org/(?:widget/)?partnerplayer/(?P<player_id>[^/]+)/
)
'''
''' % '|'.join(re.escape(p) for p in list(zip(*_STATIONS))[0])
_TESTS = [
{
@@ -174,6 +339,10 @@ class PBSIE(InfoExtractor):
{
'url': 'http://player.pbs.org/widget/partnerplayer/2365297708/?start=0&end=0&chapterbar=false&endscreen=false&topbar=true',
'only_matching': True,
},
{
'url': 'http://watch.knpb.org/video/2365616055/',
'only_matching': True,
}
]
_ERRORS = {
@@ -204,6 +373,7 @@ class PBSIE(InfoExtractor):
MEDIA_ID_REGEXES = [
r"div\s*:\s*'videoembed'\s*,\s*mediaid\s*:\s*'(\d+)'", # frontline video embed
r'class="coveplayerid">([^<]+)<', # coveplayer
r'<section[^>]+data-coveid="(\d+)"', # coveplayer from http://www.pbs.org/wgbh/frontline/film/real-csi/
r'<input type="hidden" id="pbs_video_id_[0-9]+" value="([0-9]+)"/>', # jwplayer
]
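
The `_VALID_URL` change above splices every station host into a verbose regex. A minimal sketch of that technique, assuming a hypothetical two-entry subset of the station table and a simplified pattern (the names `STATIONS`, `VALID_URL` and `match_station_id` are illustrative, not part of the extractor):

```python
import re

# Hypothetical two-entry subset of the station table, for illustration only.
STATIONS = (
    ('watch.knpb.org', 'KNPB Channel 5 (KNPB)'),
    ('video.kqed.org', 'KQED (KQED)'),
)

# zip(*STATIONS) transposes the (host, name) pairs into (hosts, names).
hosts, names = zip(*STATIONS)

# re.escape keeps the dots in each host literal; the escaped hosts are joined
# into one alternation and substituted for %s in the verbose pattern.
VALID_URL = r'''(?x)https?://
    (?:%s)/(?:viralplayer|video)/(?P<id>[0-9]+)/?
''' % '|'.join(re.escape(h) for h in hosts)


def match_station_id(url):
    """Return the numeric video id, or None if the URL does not match."""
    mobj = re.match(VALID_URL, url)
    return mobj.group('id') if mobj else None
```

Escaping matters here: without `re.escape`, the `.` in a host name would match any character, so look-alike hosts could slip through.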

@@ -1,6 +1,8 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
@@ -44,6 +46,13 @@ class PladformIE(InfoExtractor):
'only_matching': True,
}]
@staticmethod
def _extract_url(webpage):
mobj = re.search(
r'<iframe[^>]+src="(?P<url>(?:https?:)?//out\.pladform\.ru/player\?.+?)"', webpage)
if mobj:
return mobj.group('url')
def _real_extract(self, url):
video_id = self._match_id(url)

@@ -1,5 +1,6 @@
from __future__ import unicode_literals
import re
import json
import random
import collections
@@ -14,6 +15,7 @@ from ..utils import (
ExtractorError,
int_or_none,
parse_duration,
qualities,
sanitized_Request,
)
@@ -140,15 +142,28 @@ class PluralsightIE(PluralsightBaseIE):
'low': {'width': 640, 'height': 480},
'medium': {'width': 848, 'height': 640},
'high': {'width': 1024, 'height': 768},
'high-widescreen': {'width': 1280, 'height': 720},
}
QUALITIES_PREFERENCE = ('low', 'medium', 'high', 'high-widescreen',)
quality_key = qualities(QUALITIES_PREFERENCE)
AllowedQuality = collections.namedtuple('AllowedQuality', ['ext', 'qualities'])
ALLOWED_QUALITIES = (
AllowedQuality('webm', ('high',)),
AllowedQuality('mp4', ('low', 'medium', 'high',)),
AllowedQuality('webm', ['high', ]),
AllowedQuality('mp4', ['low', 'medium', 'high', ]),
)
# Some courses also offer widescreen resolution for high quality (see
# https://github.com/rg3/youtube-dl/issues/7766)
widescreen = bool(re.search(
r'courseSupportsWidescreenVideoFormats\s*:\s*true', webpage))
best_quality = 'high-widescreen' if widescreen else 'high'
if widescreen:
for allowed_quality in ALLOWED_QUALITIES:
allowed_quality.qualities.append(best_quality)
# In order to minimize the number of calls to ViewClip API and reduce
# the probability of being throttled or banned by Pluralsight we will request
# only single format until formats listing was explicitly requested.
@@ -157,19 +172,19 @@ class PluralsightIE(PluralsightBaseIE):
else:
def guess_allowed_qualities():
req_format = self._downloader.params.get('format') or 'best'
req_format_split = req_format.split('-')
req_format_split = req_format.split('-', 1)
if len(req_format_split) > 1:
req_ext, req_quality = req_format_split
for allowed_quality in ALLOWED_QUALITIES:
if req_ext == allowed_quality.ext and req_quality in allowed_quality.qualities:
return (AllowedQuality(req_ext, (req_quality, )), )
req_ext = 'webm' if self._downloader.params.get('prefer_free_formats') else 'mp4'
return (AllowedQuality(req_ext, ('high', )), )
return (AllowedQuality(req_ext, (best_quality, )), )
allowed_qualities = guess_allowed_qualities()
formats = []
for ext, qualities in allowed_qualities:
for quality in qualities:
for ext, qualities_ in allowed_qualities:
for quality in qualities_:
f = QUALITIES[quality].copy()
clip_post = {
'a': author,
@@ -205,6 +220,7 @@ class PluralsightIE(PluralsightBaseIE):
'url': clip_url,
'ext': ext,
'format_id': format_id,
'quality': quality_key(quality),
})
formats.append(f)
self._sort_formats(formats)
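
Two details of this hunk are easy to miss: formats are ranked by their position in a preference tuple, and the requested format is split on the first hyphen only because `high-widescreen` itself contains a hyphen. A minimal sketch of both, assuming a stand-in ranking helper written for this example (it is not taken from `..utils`), with `split_format` as a hypothetical name:

```python
def qualities(quality_ids):
    """Stand-in ranking helper: index in the preference list, -1 if unknown."""
    def rank(quality_id):
        try:
            return quality_ids.index(quality_id)
        except ValueError:
            return -1
    return rank


QUALITIES_PREFERENCE = ('low', 'medium', 'high', 'high-widescreen')
quality_key = qualities(QUALITIES_PREFERENCE)


def split_format(req_format):
    """Split 'ext-quality' on the first hyphen only."""
    parts = req_format.split('-', 1)
    return tuple(parts) if len(parts) > 1 else (None, None)
```

With a plain `split('-')`, `'mp4-high-widescreen'` yields three parts and the two-name unpacking in `guess_allowed_qualities` would fail; `split('-', 1)` keeps the quality name intact.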

@@ -147,7 +147,8 @@ class PornHubPlaylistIE(InfoExtractor):
entries = [
self.url_result('http://www.pornhub.com/%s' % video_url, 'PornHub')
for video_url in set(re.findall('href="/?(view_video\.php\?viewkey=\d+[^"]*)"', webpage))
for video_url in set(re.findall(
r'href="/?(view_video\.php\?.*\bviewkey=[\da-z]+[^"]*)"', webpage))
]
playlist = self._parse_json(

@@ -6,12 +6,12 @@ import re
from .common import InfoExtractor
from .brightcove import BrightcoveLegacyIE
from ..compat import compat_urllib_parse
from ..utils import (
ExtractorError,
sanitized_Request,
smuggle_url,
std_headers,
urlencode_postdata,
)
@@ -57,7 +57,7 @@ class SafariBaseIE(InfoExtractor):
}
request = sanitized_Request(
self._LOGIN_URL, compat_urllib_parse.urlencode(login_form), headers=headers)
self._LOGIN_URL, urlencode_postdata(login_form), headers=headers)
login_page = self._download_webpage(
request, None, 'Logging in as %s' % username)

@@ -0,0 +1,117 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
parse_iso8601,
parse_duration,
)
class SkyNewsArabiaBaseIE(InfoExtractor):
_IMAGE_BASE_URL = 'http://www.skynewsarabia.com/web/images'
def _call_api(self, path, value):
return self._download_json('http://api.skynewsarabia.com/web/rest/v2/%s/%s.json' % (path, value), value)
def _get_limelight_media_id(self, url):
return self._search_regex(r'/media/[^/]+/([a-z0-9]{32})', url, 'limelight media id')
def _get_image_url(self, image_path_template, width='1600', height='1200'):
return self._IMAGE_BASE_URL + image_path_template.format(width=width, height=height)
def _extract_video_info(self, video_data):
video_id = compat_str(video_data['id'])
topic = video_data.get('topicTitle')
return {
'_type': 'url_transparent',
'url': 'limelight:media:%s' % self._get_limelight_media_id(video_data['videoUrl'][0]['url']),
'id': video_id,
'title': video_data['headline'],
'description': video_data.get('summary'),
'thumbnail': self._get_image_url(video_data['mediaAsset']['imageUrl']),
'timestamp': parse_iso8601(video_data.get('date')),
'duration': parse_duration(video_data.get('runTime')),
'tags': video_data.get('tags', []),
'categories': [topic] if topic else [],
'webpage_url': 'http://www.skynewsarabia.com/web/video/%s' % video_id,
'ie_key': 'LimelightMedia',
}
class SkyNewsArabiaIE(SkyNewsArabiaBaseIE):
IE_NAME = 'skynewsarabia:video'
_VALID_URL = r'https?://(?:www\.)?skynewsarabia\.com/web/video/(?P<id>[0-9]+)'
_TEST = {
'url': 'http://www.skynewsarabia.com/web/video/794902/%D9%86%D8%B5%D9%81-%D9%85%D9%84%D9%8A%D9%88%D9%86-%D9%85%D8%B5%D8%A8%D8%A7%D8%AD-%D8%B4%D8%AC%D8%B1%D8%A9-%D9%83%D8%B1%D9%8A%D8%B3%D9%85%D8%A7%D8%B3',
'info_dict': {
'id': '794902',
'ext': 'flv',
'title': 'نصف مليون مصباح على شجرة كريسماس',
'description': 'md5:22f1b27f0850eeb10c7e59b1f16eb7c6',
'upload_date': '20151128',
'timestamp': 1448697198,
'duration': 2119,
},
'params': {
# rtmp download
'skip_download': True,
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
video_data = self._call_api('video', video_id)
return self._extract_video_info(video_data)
class SkyNewsArabiaArticleIE(SkyNewsArabiaBaseIE):
IE_NAME = 'skynewsarabia:article'
_VALID_URL = r'https?://(?:www\.)?skynewsarabia\.com/web/article/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'http://www.skynewsarabia.com/web/article/794549/%D8%A7%D9%94%D8%AD%D8%AF%D8%A7%D8%AB-%D8%A7%D9%84%D8%B4%D8%B1%D9%82-%D8%A7%D9%84%D8%A7%D9%94%D9%88%D8%B3%D8%B7-%D8%AE%D8%B1%D9%8A%D8%B7%D8%A9-%D8%A7%D9%84%D8%A7%D9%94%D9%84%D8%B9%D8%A7%D8%A8-%D8%A7%D9%84%D8%B0%D9%83%D9%8A%D8%A9',
'info_dict': {
'id': '794549',
'ext': 'flv',
'title': 'بالفيديو.. ألعاب ذكية تحاكي واقع المنطقة',
'description': 'md5:0c373d29919a851e080ee4edd0c5d97f',
'upload_date': '20151126',
'timestamp': 1448559336,
'duration': 281.6,
},
'params': {
# rtmp download
'skip_download': True,
},
}, {
'url': 'http://www.skynewsarabia.com/web/article/794844/%D8%A7%D8%B3%D8%AA%D9%87%D8%AF%D8%A7%D9%81-%D9%82%D9%88%D8%A7%D8%B1%D8%A8-%D8%A7%D9%94%D8%B3%D9%84%D8%AD%D8%A9-%D9%84%D9%85%D9%8A%D9%84%D9%8A%D8%B4%D9%8A%D8%A7%D8%AA-%D8%A7%D9%84%D8%AD%D9%88%D8%AB%D9%8A-%D9%88%D8%B5%D8%A7%D9%84%D8%AD',
'info_dict': {
'id': '794844',
'title': 'إحباط تهريب أسلحة لميليشيات الحوثي وصالح بجنوب اليمن',
'description': 'md5:5c927b8b2e805796e7f693538d96fc7e',
},
'playlist_mincount': 2,
}]
def _real_extract(self, url):
article_id = self._match_id(url)
article_data = self._call_api('article', article_id)
media_asset = article_data['mediaAsset']
if media_asset['type'] == 'VIDEO':
topic = article_data.get('topicTitle')
return {
'_type': 'url_transparent',
'url': 'limelight:media:%s' % self._get_limelight_media_id(media_asset['videoUrl'][0]['url']),
'id': article_id,
'title': article_data['headline'],
'description': article_data.get('summary'),
'thumbnail': self._get_image_url(media_asset['imageUrl']),
'timestamp': parse_iso8601(article_data.get('date')),
'tags': article_data.get('tags', []),
'categories': [topic] if topic else [],
'webpage_url': url,
'ie_key': 'LimelightMedia',
}
entries = [self._extract_video_info(item) for item in article_data.get('inlineItems', []) if item['type'] == 'VIDEO']
return self.playlist_result(entries, article_id, article_data['headline'], article_data.get('summary'))
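
The base class relies on two small helpers: one fills a `{width}`/`{height}` image path template, the other pulls a 32-character Limelight media id out of a stream URL. A standalone sketch of both follows. The sample template and video URL are invented for illustration, since the real API responses are not shown in this diff:

```python
import re

IMAGE_BASE_URL = 'http://www.skynewsarabia.com/web/images'


def get_image_url(image_path_template, width='1600', height='1200'):
    """Fill the {width}/{height} placeholders and prepend the image base URL."""
    return IMAGE_BASE_URL + image_path_template.format(width=width, height=height)


def get_limelight_media_id(url):
    """Return the 32-character media id that follows /media/<segment>/, if any."""
    mobj = re.search(r'/media/[^/]+/([a-z0-9]{32})', url)
    return mobj.group(1) if mobj else None


# Hypothetical inputs; the actual shape of the API data is an assumption.
template = '/2015/11/28/794902/{width}/{height}/example.jpg'
video_url = 'http://example.invalid/media/abc/0123456789abcdef0123456789abcdef/file.flv'

limelight_url = 'limelight:media:%s' % get_limelight_media_id(video_url)
```

The resulting `limelight:media:<id>` string is what the extractor hands over through the `url_transparent` result.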

@@ -158,6 +158,7 @@ class SohuIE(InfoExtractor):
'file': clips_url[i],
'new': su[i],
'prod': 'flash',
'rb': 1,
}
if cdnId is not None:

@@ -58,7 +58,8 @@ class SpiegelIE(InfoExtractor):
description = self._html_search_meta('description', webpage, 'description')
base_url = self._search_regex(
r'var\s+server\s*=\s*"([^"]+)\"', webpage, 'server URL')
[r'server\s*:\s*(["\'])(?P<url>.+?)\1', r'var\s+server\s*=\s*"(?P<url>[^"]+)\"'],
webpage, 'server URL', group='url')
xml_url = base_url + video_id + '.xml'
idoc = self._download_xml(xml_url, video_id)

@@ -11,7 +11,7 @@ from ..utils import (
class SrfIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.srf\.ch/play(?:er)?/tv/[^/]+/video/(?P<display_id>[^?]+)\?id=|tp\.srgssr\.ch/p/flash\?urn=urn:srf:ais:video:)(?P<id>[0-9a-f\-]{36})'
_VALID_URL = r'https?://(?:www\.srf\.ch/play(?:er)?/(?:tv|radio)/[^/]+/(?P<media_type>video|audio)/(?P<display_id>[^?]+)\?id=|tp\.srgssr\.ch/p/flash\?urn=urn:srf:ais:video:)(?P<id>[0-9a-f\-]{36})'
_TESTS = [{
'url': 'http://www.srf.ch/play/tv/10vor10/video/snowden-beantragt-asyl-in-russland?id=28e1a57d-5b76-4399-8ab3-9097f071e6c5',
'md5': '4cd93523723beff51bb4bee974ee238d',
@@ -35,6 +35,20 @@ class SrfIE(InfoExtractor):
'title': 'Jaguar XK120, Shadow und Tornado-Dampflokomotive',
'timestamp': 1373493600,
},
}, {
'url': 'http://www.srf.ch/play/radio/hoerspielarchiv-srf-musikwelle/audio/saegel-ohni-wind-von-jakob-stebler?id=415bf3d3-6429-4de7-968d-95866e37cfbc',
'md5': '',
'info_dict': {
'id': '415bf3d3-6429-4de7-968d-95866e37cfbc',
'display_id': 'saegel-ohni-wind-von-jakob-stebler',
'ext': 'mp3',
'upload_date': '20080518',
'title': '«Sägel ohni Wind» von Jakob Stebler',
'timestamp': 1211112000,
},
'params': {
'skip_download': True, # requires rtmpdump
},
}, {
'url': 'http://www.srf.ch/player/tv/10vor10/video/snowden-beantragt-asyl-in-russland?id=28e1a57d-5b76-4399-8ab3-9097f071e6c5',
'only_matching': True,
@@ -44,11 +58,13 @@ class SrfIE(InfoExtractor):
}]
def _real_extract(self, url):
video_id = self._match_id(url)
display_id = re.match(self._VALID_URL, url).group('display_id') or video_id
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
media_type = mobj.group('media_type') or 'video'
display_id = mobj.group('display_id') or video_id
video_data = self._download_xml(
'http://il.srgssr.ch/integrationlayer/1.0/ue/srf/video/play/%s.xml' % video_id,
'http://il.srgssr.ch/integrationlayer/1.0/ue/srf/%s/play/%s.xml' % (media_type, video_id),
display_id)
title = xpath_text(
@@ -64,7 +80,7 @@ class SrfIE(InfoExtractor):
for url_node in item.findall('url'):
quality = url_node.attrib['quality']
full_url = url_node.text
original_ext = determine_ext(full_url)
original_ext = determine_ext(full_url).lower()
format_id = '%s-%s' % (quality, item.attrib['protocol'])
if original_ext == 'f4m':
formats.extend(self._extract_f4m_formats(
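
The new `media_type` group only participates in the `www.srf.ch` branch of `_VALID_URL`; for `tp.srgssr.ch` flash URLs it is `None`. A small sketch of how the integration-layer URL could be derived from the match, where the fallback to `'video'` and the helper name `api_url` are assumptions of this example:

```python
import re

VALID_URL = r'https?://(?:www\.srf\.ch/play(?:er)?/(?:tv|radio)/[^/]+/(?P<media_type>video|audio)/(?P<display_id>[^?]+)\?id=|tp\.srgssr\.ch/p/flash\?urn=urn:srf:ais:video:)(?P<id>[0-9a-f\-]{36})'


def api_url(url):
    """Build the play XML URL for a matching page or flash URL."""
    mobj = re.match(VALID_URL, url)
    # A group that did not take part in the match is None, so fall back to
    # 'video' (assumed from the urn:srf:ais:video: prefix of the flash URLs).
    media_type = mobj.group('media_type') or 'video'
    return 'http://il.srgssr.ch/integrationlayer/1.0/ue/srf/%s/play/%s.xml' % (
        media_type, mobj.group('id'))
```

Without the fallback, formatting `None` with `%s` would produce a path segment reading `None`.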

@@ -16,6 +16,7 @@ class TeachingChannelIE(InfoExtractor):
'ext': 'mp4',
'title': 'A History of Teaming',
'description': 'md5:2a9033db8da81f2edffa4c99888140b3',
'duration': 422.255,
},
'params': {
# m3u8 download

@@ -16,11 +16,12 @@ from ..compat import (
from ..utils import (
determine_ext,
ExtractorError,
xpath_with_ns,
unsmuggle_url,
int_or_none,
url_basename,
float_or_none,
int_or_none,
sanitized_Request,
unsmuggle_url,
url_basename,
xpath_with_ns,
)
default_ns = 'http://www.w3.org/2005/SMIL21/Language'
@@ -204,7 +205,12 @@ class ThePlatformIE(ThePlatformBaseIE):
smil_url = url
# Explicitly specified SMIL (see https://github.com/rg3/youtube-dl/issues/7385)
elif '/guid/' in url:
webpage = self._download_webpage(url, video_id)
headers = {}
source_url = smuggled_data.get('source_url')
if source_url:
headers['Referer'] = source_url
request = sanitized_Request(url, headers=headers)
webpage = self._download_webpage(request, video_id)
smil_url = self._search_regex(
r'<link[^>]+href=(["\'])(?P<url>.+?)\1[^>]+type=["\']application/smil\+xml',
webpage, 'smil url', group='url')
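
The `/guid/` branch now sends the embedding page as `Referer` when one was smuggled in with the URL. A rough standalone sketch of that header logic, using the standard library's `Request` as a stand-in for `sanitized_Request` (the helper name `build_request` and the sample URLs are hypothetical):

```python
from urllib.request import Request


def build_request(url, smuggled_data):
    """Attach a Referer header only when a source URL is available."""
    headers = {}
    source_url = smuggled_data.get('source_url')
    if source_url:
        headers['Referer'] = source_url
    return Request(url, headers=headers)


# Hypothetical URLs, for illustration only.
with_referer = build_request(
    'http://example.com/guid/1', {'source_url': 'http://example.org/page'})
without_referer = build_request('http://example.com/guid/1', {})
```

Building the headers dict first keeps the request identical to the old behaviour when no source URL is known.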

@@ -1,80 +1,103 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import ExtractorError
from ..utils import (
ExtractorError,
int_or_none,
parse_iso8601,
)
class TriluliluIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?trilulilu\.ro/(?:video-[^/]+/)?(?P<id>[^/#\?]+)'
_TEST = {
'url': 'http://www.trilulilu.ro/video-animatie/big-buck-bunny-1',
'md5': 'c1450a00da251e2769b74b9005601cac',
_VALID_URL = r'https?://(?:(?:www|m)\.)?trilulilu\.ro/(?:[^/]+/)?(?P<id>[^/#\?]+)'
_TESTS = [{
'url': 'http://www.trilulilu.ro/big-buck-bunny-1',
'md5': '68da087b676a6196a413549212f60cc6',
'info_dict': {
'id': 'ae2899e124140b',
'ext': 'mp4',
'title': 'Big Buck Bunny',
'description': ':) pentru copilul din noi',
'uploader_id': 'chipy',
'upload_date': '20120304',
'timestamp': 1330830647,
'uploader': 'chipy',
'view_count': int,
'like_count': int,
'comment_count': int,
},
}
}, {
'url': 'http://www.trilulilu.ro/adena-ft-morreti-inocenta',
'md5': '929dfb8729dc71750463af88bbbbf4a4',
'info_dict': {
'id': 'f299710e3c91c5',
'ext': 'mp4',
'title': 'Adena ft. Morreti - Inocenta',
'description': 'pop music',
'uploader_id': 'VEVOmixt',
'upload_date': '20151204',
'uploader': 'VEVOmixt',
'timestamp': 1449187937,
'view_count': int,
'like_count': int,
'comment_count': int,
},
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
media_info = self._download_json('http://m.trilulilu.ro/%s?format=json' % display_id, display_id)
if re.search(r'Fişierul nu este disponibil pentru vizionare în ţara dumneavoastră', webpage):
raise ExtractorError(
'This video is not available in your country.', expected=True)
elif re.search('Fişierul poate fi accesat doar de către prietenii lui', webpage):
age_limit = 0
errors = media_info.get('errors', {})
if errors.get('friends'):
raise ExtractorError('This video is private.', expected=True)
elif errors.get('geoblock'):
raise ExtractorError('This video is not available in your country.', expected=True)
elif errors.get('xxx_unlogged'):
age_limit = 18
flashvars_str = self._search_regex(
r'block_flash_vars\s*=\s*(\{[^\}]+\})', webpage, 'flashvars', fatal=False, default=None)
media_class = media_info.get('class')
if media_class not in ('video', 'audio'):
raise ExtractorError('Media is neither a video nor an audio file')
if flashvars_str:
flashvars = self._parse_json(flashvars_str, display_id)
user = media_info.get('user', {})
thumbnail = media_info.get('cover_url')
if thumbnail:
thumbnail = thumbnail.format(width='1600', height='1200')
# TODO: get correct ext for audio files
stream_type = media_info.get('stream_type')
formats = [{
'url': media_info['href'],
'ext': stream_type,
}]
if media_info.get('is_hd'):
formats.append({
'format_id': 'hd',
'url': media_info['hrefhd'],
'ext': stream_type,
})
if media_class == 'audio':
formats[0]['vcodec'] = 'none'
else:
raise ExtractorError(
'This page does not contain videos', expected=True)
if flashvars['isMP3'] == 'true':
raise ExtractorError(
'Audio downloads are currently not supported', expected=True)
video_id = flashvars['hash']
title = self._og_search_title(webpage)
thumbnail = self._og_search_thumbnail(webpage)
description = self._og_search_description(webpage, default=None)
format_url = ('http://fs%(server)s.trilulilu.ro/%(hash)s/'
'video-formats2' % flashvars)
format_doc = self._download_xml(
format_url, video_id,
note='Downloading formats',
errnote='Error while downloading formats')
video_url_template = (
'http://fs%(server)s.trilulilu.ro/stream.php?type=video'
'&source=site&hash=%(hash)s&username=%(userid)s&'
'key=ministhebest&format=%%s&sig=&exp=' %
flashvars)
formats = [
{
'format_id': fnode.text.partition('-')[2],
'url': video_url_template % fnode.text,
'ext': fnode.text.partition('-')[0]
}
for fnode in format_doc.findall('./formats/format')
]
formats[0]['format_id'] = 'sd'
return {
'id': video_id,
'id': media_info['identifier'].split('|')[1],
'display_id': display_id,
'formats': formats,
'title': title,
'description': description,
'title': media_info['title'],
'description': media_info.get('description'),
'thumbnail': thumbnail,
'uploader_id': user.get('username'),
'uploader': user.get('fullname'),
'timestamp': parse_iso8601(media_info.get('published'), ' '),
'duration': int_or_none(media_info.get('duration')),
'view_count': int_or_none(media_info.get('count_views')),
'like_count': int_or_none(media_info.get('count_likes')),
'comment_count': int_or_none(media_info.get('count_comments')),
'age_limit': age_limit,
}
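
The rewritten extractor reads availability from the `errors` object of the JSON response instead of searching the page text. A simplified sketch of that mapping, assuming the three keys shown in the diff are simple truthy flags (`MediaUnavailable` and `age_limit_or_raise` are hypothetical names, and the exception class stands in for `ExtractorError`):

```python
class MediaUnavailable(Exception):
    """Hypothetical stand-in for the extractor's error type."""


def age_limit_or_raise(media_info):
    """Raise for private or geo-blocked media, otherwise return the age limit."""
    errors = media_info.get('errors', {})
    if errors.get('friends'):
        raise MediaUnavailable('This video is private.')
    if errors.get('geoblock'):
        raise MediaUnavailable('This video is not available in your country.')
    # Adult content seen without a login is still extractable, but age-gated.
    return 18 if errors.get('xxx_unlogged') else 0
```

Checking structured flags avoids depending on the exact Romanian wording of the page, which the old regex-based checks did.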

@@ -1,14 +1,15 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_HTTPError,
compat_urllib_parse,
compat_urllib_request,
)
from ..utils import (
ExtractorError,
float_or_none,
int_or_none,
sanitized_Request,
)
@@ -18,6 +19,8 @@ class UdemyIE(InfoExtractor):
_VALID_URL = r'https?://www\.udemy\.com/(?:[^#]+#/lecture/|lecture/view/?\?lectureId=)(?P<id>\d+)'
_LOGIN_URL = 'https://www.udemy.com/join/login-popup/?displayType=ajax&showSkipButton=1'
_ORIGIN_URL = 'https://www.udemy.com'
_SUCCESSFULLY_ENROLLED = '>You have enrolled in this course!<'
_ALREADY_ENROLLED = '>You are already taking this course.<'
_NETRC_MACHINE = 'udemy'
_TESTS = [{
@@ -33,6 +36,29 @@ class UdemyIE(InfoExtractor):
'skip': 'Requires udemy account credentials',
}]
def _enroll_course(self, webpage, course_id):
enroll_url = self._search_regex(
r'href=(["\'])(?P<url>https?://(?:www\.)?udemy\.com/course/subscribe/.+?)\1',
webpage, 'enroll url', group='url',
default='https://www.udemy.com/course/subscribe/?courseId=%s' % course_id)
webpage = self._download_webpage(enroll_url, course_id, 'Enrolling in the course')
if self._SUCCESSFULLY_ENROLLED in webpage:
self.to_screen('%s: Successfully enrolled in' % course_id)
elif self._ALREADY_ENROLLED in webpage:
self.to_screen('%s: Already enrolled in' % course_id)
def _download_lecture(self, course_id, lecture_id):
return self._download_json(
'https://www.udemy.com/api-2.0/users/me/subscribed-courses/%s/lectures/%s?%s' % (
course_id, lecture_id, compat_urllib_parse.urlencode({
'video_only': '',
'auto_play': '',
'fields[lecture]': 'title,description,asset',
'fields[asset]': 'asset_type,stream_url,thumbnail_url,download_urls,data',
'instructorPreviewMode': 'False',
})),
lecture_id, 'Downloading lecture JSON')
def _handle_error(self, response):
if not isinstance(response, dict):
return
@@ -54,6 +80,7 @@ class UdemyIE(InfoExtractor):
headers['X-Udemy-Client-Id'] = cookie.value
elif cookie.name == 'access_token':
headers['X-Udemy-Bearer-Token'] = cookie.value
headers['X-Udemy-Authorization'] = 'Bearer %s' % cookie.value
if isinstance(url_or_request, compat_urllib_request.Request):
for header, value in headers.items():
@@ -71,7 +98,7 @@ class UdemyIE(InfoExtractor):
def _login(self):
(username, password) = self._get_login_info()
if username is None:
self.raise_login_required('Udemy account is required')
return
login_popup = self._download_webpage(
self._LOGIN_URL, None, 'Downloading login popup')
@@ -109,44 +136,76 @@ class UdemyIE(InfoExtractor):
def _real_extract(self, url):
lecture_id = self._match_id(url)
lecture = self._download_json(
'https://www.udemy.com/api-1.1/lectures/%s' % lecture_id,
lecture_id, 'Downloading lecture JSON')
webpage = self._download_webpage(url, lecture_id)
asset_type = lecture.get('assetType') or lecture.get('asset_type')
course_id = self._search_regex(
r'data-course-id=["\'](\d+)', webpage, 'course id')
try:
lecture = self._download_lecture(course_id, lecture_id)
except ExtractorError as e:
# Error could possibly mean we are not enrolled in the course
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
self._enroll_course(webpage, course_id)
lecture = self._download_lecture(course_id, lecture_id)
else:
raise
title = lecture['title']
description = lecture.get('description')
asset = lecture['asset']
asset_type = asset.get('assetType') or asset.get('asset_type')
if asset_type != 'Video':
raise ExtractorError(
'Lecture %s is not a video' % lecture_id, expected=True)
asset = lecture['asset']
stream_url = asset.get('streamUrl') or asset.get('stream_url')
mobj = re.search(r'(https?://www\.youtube\.com/watch\?v=.*)', stream_url)
if mobj:
return self.url_result(mobj.group(1), 'Youtube')
if stream_url:
youtube_url = self._search_regex(
r'(https?://www\.youtube\.com/watch\?v=.*)', stream_url, 'youtube URL', default=None)
if youtube_url:
return self.url_result(youtube_url, 'Youtube')
video_id = asset['id']
thumbnail = asset.get('thumbnailUrl') or asset.get('thumbnail_url')
duration = asset['data']['duration']
duration = float_or_none(asset.get('data', {}).get('duration'))
outputs = asset.get('data', {}).get('outputs', {})
download_url = asset.get('downloadUrl') or asset.get('download_url')
formats = []
for format_ in asset.get('download_urls', {}).get('Video', []):
video_url = format_.get('file')
if not video_url:
continue
format_id = format_.get('label')
f = {
'url': video_url,
'height': int_or_none(format_id),
}
if format_id:
# Some videos contain additional metadata (e.g.
# https://www.udemy.com/ios9-swift/learn/#/lecture/3383208)
output = outputs.get(format_id)
if isinstance(output, dict):
f.update({
'format_id': '%sp' % (output.get('label') or format_id),
'width': int_or_none(output.get('width')),
'height': int_or_none(output.get('height')),
'vbr': int_or_none(output.get('video_bitrate_in_kbps')),
'vcodec': output.get('video_codec'),
'fps': int_or_none(output.get('frame_rate')),
'abr': int_or_none(output.get('audio_bitrate_in_kbps')),
'acodec': output.get('audio_codec'),
'asr': int_or_none(output.get('audio_sample_rate')),
'tbr': int_or_none(output.get('total_bitrate_in_kbps')),
'filesize': int_or_none(output.get('file_size_in_bytes')),
})
else:
f['format_id'] = '%sp' % format_id
formats.append(f)
video = download_url.get('Video') or download_url.get('video')
video_480p = download_url.get('Video480p') or download_url.get('video_480p')
formats = [
{
'url': video_480p[0],
'format_id': '360p',
},
{
'url': video[0],
'format_id': '720p',
},
]
title = lecture['title']
description = lecture['description']
self._sort_formats(formats)
return {
'id': video_id,
@@ -160,9 +219,7 @@ class UdemyIE(InfoExtractor):
class UdemyCourseIE(UdemyIE):
IE_NAME = 'udemy:course'
_VALID_URL = r'https?://www\.udemy\.com/(?P<coursepath>[\da-z-]+)'
_SUCCESSFULLY_ENROLLED = '>You have enrolled in this course!<'
_ALREADY_ENROLLED = '>You are already taking this course.<'
_VALID_URL = r'https?://www\.udemy\.com/(?P<id>[\da-z-]+)'
_TESTS = []
@classmethod
@@ -170,24 +227,18 @@ class UdemyCourseIE(UdemyIE):
return False if UdemyIE.suitable(url) else super(UdemyCourseIE, cls).suitable(url)
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
course_path = mobj.group('coursepath')
course_path = self._match_id(url)
webpage = self._download_webpage(url, course_path)
response = self._download_json(
'https://www.udemy.com/api-1.1/courses/%s' % course_path,
course_path, 'Downloading course JSON')
course_id = int(response['id'])
course_title = response['title']
course_id = response['id']
course_title = response.get('title')
webpage = self._download_webpage(
'https://www.udemy.com/course/subscribe/?courseId=%s' % course_id,
course_id, 'Enrolling in the course')
if self._SUCCESSFULLY_ENROLLED in webpage:
self.to_screen('%s: Successfully enrolled in' % course_id)
elif self._ALREADY_ENROLLED in webpage:
self.to_screen('%s: Already enrolled in' % course_id)
self._enroll_course(webpage, course_id)
response = self._download_json(
'https://www.udemy.com/api-1.1/courses/%s/curriculum' % course_id,


@@ -3,7 +3,10 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_etree_fromstring
from ..compat import (
compat_etree_fromstring,
compat_urlparse,
)
from ..utils import (
ExtractorError,
int_or_none,
@@ -67,6 +70,17 @@ class VevoIE(InfoExtractor):
'params': {
'skip_download': 'true',
}
}, {
'note': 'No video_info',
'url': 'http://www.vevo.com/watch/k-camp-1/Till-I-Die/USUV71503000',
'md5': '8b83cc492d72fc9cf74a02acee7dc1b0',
'info_dict': {
'id': 'USUV71503000',
'ext': 'mp4',
'title': 'Till I Die - K Camp ft. T.I.',
'duration': 193,
},
'expected_warnings': ['Unable to download SMIL file'],
}]
_SMIL_BASE_URL = 'http://smil.lvl3.vevo.com/'
@@ -81,11 +95,17 @@ class VevoIE(InfoExtractor):
if webpage is False:
self._oauth_token = None
else:
if 'THIS PAGE IS CURRENTLY UNAVAILABLE IN YOUR REGION' in webpage:
raise ExtractorError('%s said: This page is currently unavailable in your region.' % self.IE_NAME, expected=True)
self._oauth_token = self._search_regex(
r'access_token":\s*"([^"]+)"',
webpage, 'access token', fatal=False)
def _formats_from_json(self, video_info):
if not video_info:
return []
last_version = {'version': -1}
for version in video_info['videoVersions']:
# These are the HTTP downloads, other types are for different manifests
@@ -110,9 +130,8 @@ class VevoIE(InfoExtractor):
})
return formats
def _formats_from_smil(self, smil_xml):
def _formats_from_smil(self, smil_doc):
formats = []
smil_doc = compat_etree_fromstring(smil_xml.encode('utf-8'))
els = smil_doc.findall('.//{http://www.w3.org/2001/SMIL20/Language}video')
for el in els:
src = el.attrib['src']
@@ -145,14 +164,14 @@ class VevoIE(InfoExtractor):
})
return formats
def _download_api_formats(self, video_id):
def _download_api_formats(self, video_id, video_url):
if not self._oauth_token:
self._downloader.report_warning(
'No oauth token available, skipping API HLS download')
return []
api_url = 'https://apiv2.vevo.com/video/%s/streams/hls?token=%s' % (
video_id, self._oauth_token)
api_url = compat_urlparse.urljoin(video_url, '//apiv2.vevo.com/video/%s/streams/hls?token=%s' % (
video_id, self._oauth_token))
api_data = self._download_json(
api_url, video_id,
note='Downloading HLS formats',
@@ -166,18 +185,26 @@ class VevoIE(InfoExtractor):
preference=0)
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
video_id = self._match_id(url)
webpage = None
json_url = 'http://videoplayer.vevo.com/VideoService/AuthenticateVideo?isrc=%s' % video_id
response = self._download_json(json_url, video_id)
video_info = response['video']
video_info = response['video'] or {}
if not video_info:
if not video_info and response.get('statusCode') != 909:
if 'statusMessage' in response:
raise ExtractorError('%s said: %s' % (self.IE_NAME, response['statusMessage']), expected=True)
raise ExtractorError('Unable to extract videos')
if not video_info:
if url.startswith('vevo:'):
raise ExtractorError('Please specify full Vevo URL for downloading', expected=True)
webpage = self._download_webpage(url, video_id)
title = video_info.get('title') or self._og_search_title(webpage)
formats = self._formats_from_json(video_info)
is_explicit = video_info.get('isExplicit')
@@ -189,11 +216,11 @@ class VevoIE(InfoExtractor):
age_limit = None
# Download via HLS API
formats.extend(self._download_api_formats(video_id))
formats.extend(self._download_api_formats(video_id, url))
# Download SMIL
smil_blocks = sorted((
f for f in video_info['videoVersions']
f for f in video_info.get('videoVersions', [])
if f['sourceType'] == 13),
key=lambda f: f['version'])
smil_url = '%s/Video/V2/VFILE/%s/%sr.smil' % (
@@ -205,23 +232,26 @@ class VevoIE(InfoExtractor):
if smil_url_m is not None:
smil_url = smil_url_m
if smil_url:
smil_xml = self._download_webpage(
smil_url, video_id, 'Downloading SMIL info', fatal=False)
if smil_xml:
formats.extend(self._formats_from_smil(smil_xml))
smil_doc = self._download_smil(smil_url, video_id, fatal=False)
if smil_doc:
formats.extend(self._formats_from_smil(smil_doc))
self._sort_formats(formats)
timestamp_ms = int_or_none(self._search_regex(
timestamp = int_or_none(self._search_regex(
r'/Date\((\d+)\)/',
video_info['launchDate'], 'launch date', fatal=False))
video_info['launchDate'], 'launch date', fatal=False),
scale=1000) if video_info else None
duration = video_info.get('duration') or int_or_none(
self._html_search_meta('video:duration', webpage))
return {
'id': video_id,
'title': video_info['title'],
'title': title,
'formats': formats,
'thumbnail': video_info['imageUrl'],
'timestamp': timestamp_ms // 1000,
'uploader': video_info['mainArtists'][0]['artistName'],
'duration': video_info['duration'],
'thumbnail': video_info.get('imageUrl'),
'timestamp': timestamp,
'uploader': video_info['mainArtists'][0]['artistName'] if video_info else None,
'duration': duration,
'age_limit': age_limit,
}
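Review note on the `_download_api_formats` change above: building the API URL with `urljoin` and a protocol-relative path makes the request inherit the scheme of the page URL. A minimal sketch of that behaviour, assuming Python 3's `urllib.parse.urljoin` stands in for `compat_urlparse.urljoin`; the helper name `build_api_url` and the token value are hypothetical and only used for illustration.

```python
from urllib.parse import urljoin


def build_api_url(video_url, video_id, token):
    # Hypothetical helper: mirrors the string formatting used in the hunk.
    # The leading '//' makes the path a network-path reference, so the
    # scheme is taken from video_url rather than being hard-coded.
    return urljoin(
        video_url,
        '//apiv2.vevo.com/video/%s/streams/hls?token=%s' % (video_id, token))


# An http page yields an http API URL, an https page an https one.
print(build_api_url(
    'http://www.vevo.com/watch/k-camp-1/Till-I-Die/USUV71503000',
    'USUV71503000', 'TOKEN'))
print(build_api_url(
    'https://www.vevo.com/watch/k-camp-1/Till-I-Die/USUV71503000',
    'USUV71503000', 'TOKEN'))
```

The host and path come entirely from the second argument; only the scheme is borrowed from the page URL.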


@@ -15,6 +15,7 @@ class ViceIE(InfoExtractor):
'id': '43cW1mYzpia9IlestBjVpd23Yu3afAfp',
'ext': 'mp4',
'title': 'VICE_COWBOYCAPITALISTS_PART01_v1_VICE_WM_1080p.mov',
'duration': 725.983,
},
'params': {
# Requires ffmpeg (m3u8 manifest)


@@ -1,26 +0,0 @@
from __future__ import unicode_literals
from .novamov import NovaMovIE
class VideoWeedIE(NovaMovIE):
IE_NAME = 'videoweed'
IE_DESC = 'VideoWeed'
_VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': 'videoweed\.(?:es|com)'}
_HOST = 'www.videoweed.es'
_FILE_DELETED_REGEX = r'>This file no longer exists on our servers.<'
_TITLE_REGEX = r'<h1 class="text_shadow">([^<]+)</h1>'
_TEST = {
'url': 'http://www.videoweed.es/file/b42178afbea14',
'md5': 'abd31a2132947262c50429e1d16c1bfd',
'info_dict': {
'id': 'b42178afbea14',
'ext': 'flv',
'title': 'optical illusion dissapeared image magic illusion',
'description': ''
},
}


@@ -18,6 +18,7 @@ from ..utils import (
unified_strdate,
)
from .vimeo import VimeoIE
from .pladform import PladformIE
class VKIE(InfoExtractor):
@@ -164,6 +165,11 @@ class VKIE(InfoExtractor):
# vk wrapper
'url': 'http://www.biqle.ru/watch/847655_160197695',
'only_matching': True,
},
{
# pladform embed
'url': 'https://vk.com/video-76116461_171554880',
'only_matching': True,
}
]
@@ -254,10 +260,13 @@ class VKIE(InfoExtractor):
if vimeo_url is not None:
return self.url_result(vimeo_url)
pladform_url = PladformIE._extract_url(info_page)
if pladform_url:
return self.url_result(pladform_url)
m_rutube = re.search(
r'\ssrc="((?:https?:)?//rutube\.ru\\?/video\\?/embed(?:.*?))\\?"', info_page)
if m_rutube is not None:
self.to_screen('rutube video detected')
rutube_url = self._proto_relative_url(
m_rutube.group(1).replace('\\', ''))
return self.url_result(rutube_url)


@@ -3,11 +3,14 @@ from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_urllib_parse
from ..utils import sanitized_Request
from ..utils import (
ExtractorError,
sanitized_Request,
)
class VodlockerIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?vodlocker\.com/(?P<id>[0-9a-zA-Z]+)(?:\..*?)?'
_VALID_URL = r'https?://(?:www\.)?vodlocker\.com/(?:embed-)?(?P<id>[0-9a-zA-Z]+)(?:\..*?)?'
_TESTS = [{
'url': 'http://vodlocker.com/e8wvyzz4sl42',
@@ -24,6 +27,12 @@ class VodlockerIE(InfoExtractor):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
if any(p in webpage for p in (
'>THIS FILE WAS DELETED<',
'>File Not Found<',
'The file you were looking for could not be found, sorry for any inconvenience.<')):
raise ExtractorError('Video %s does not exist' % video_id, expected=True)
fields = self._hidden_inputs(webpage)
if fields['op'] == 'download1':


@@ -10,8 +10,8 @@ from ..compat import (
compat_urlparse,
)
from ..utils import (
determine_ext,
unified_strdate,
qualities,
)
@@ -33,6 +33,7 @@ class WDRIE(InfoExtractor):
'params': {
'skip_download': True,
},
'skip': 'Page Not Found',
},
{
'url': 'http://www1.wdr.de/themen/av/videomargaspiegelisttot101-videoplayer.html',
@@ -47,6 +48,7 @@ class WDRIE(InfoExtractor):
'params': {
'skip_download': True,
},
'skip': 'Page Not Found',
},
{
'url': 'http://www1.wdr.de/themen/kultur/audioerlebtegeschichtenmargaspiegel100-audioplayer.html',
@@ -71,6 +73,7 @@ class WDRIE(InfoExtractor):
'upload_date': '20140717',
'is_live': False
},
'skip': 'Page Not Found',
},
{
'url': 'http://www1.wdr.de/mediathek/video/sendungen/quarks_und_co/filterseite-quarks-und-co100.html',
@@ -83,10 +86,10 @@ class WDRIE(InfoExtractor):
'url': 'http://www1.wdr.de/mediathek/video/livestream/index.html',
'info_dict': {
'id': 'mdb-103364',
'title': 're:^WDR Fernsehen [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'title': 're:^WDR Fernsehen Live [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': 'md5:ae2ff888510623bf8d4b115f95a9b7c9',
'ext': 'flv',
'upload_date': '20150212',
'upload_date': '20150101',
'is_live': True
},
'params': {
@@ -150,25 +153,52 @@ class WDRIE(InfoExtractor):
if upload_date:
upload_date = unified_strdate(upload_date)
formats = []
preference = qualities(['S', 'M', 'L', 'XL'])
if video_url.endswith('.f4m'):
video_url += '?hdcore=3.2.0&plugin=aasp-3.2.0.77.18'
ext = 'flv'
f4m_formats = self._extract_f4m_formats(video_url + '?hdcore=3.2.0&plugin=aasp-3.2.0.77.18', page_id, f4m_id='hds', fatal=False)
if f4m_formats:
formats.extend(f4m_formats)
elif video_url.endswith('.smil'):
fmt = self._extract_smil_formats(video_url, page_id)[0]
video_url = fmt['url']
sep = '&' if '?' in video_url else '?'
video_url += sep
video_url += 'hdcore=3.3.0&plugin=aasp-3.3.0.99.43'
ext = fmt['ext']
smil_formats = self._extract_smil_formats(video_url, page_id, False, {
'hdcore': '3.3.0',
'plugin': 'aasp-3.3.0.99.43',
})
if smil_formats:
formats.extend(smil_formats)
else:
ext = determine_ext(video_url)
formats.append({
'url': video_url,
'http_headers': {
'User-Agent': 'mobile',
},
})
m3u8_url = self._search_regex(r'rel="adaptiv"[^>]+href="([^"]+)"', webpage, 'm3u8 url', default=None)
if m3u8_url:
m3u8_formats = self._extract_m3u8_formats(m3u8_url, page_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False)
if m3u8_formats:
formats.extend(m3u8_formats)
direct_urls = re.findall(r'rel="web(S|M|L|XL)"[^>]+href="([^"]+)"', webpage)
if direct_urls:
for quality, video_url in direct_urls:
formats.append({
'url': video_url,
'preference': preference(quality),
'http_headers': {
'User-Agent': 'mobile',
},
})
self._sort_formats(formats)
description = self._html_search_meta('Description', webpage, 'description')
return {
'id': page_id,
'url': video_url,
'ext': ext,
'formats': formats,
'title': title,
'description': description,
'thumbnail': thumbnail,
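Review note on the WDR hunk above: the direct `webS`/`webM`/`webL`/`webXL` links are ranked through `qualities(...)`. The sketch below shows how that ranking could behave. The `qualities` function here is a simplified stand-in written for this note (an assumption about how the helper in `..utils` behaves), `SAMPLE_PAGE` is made-up markup, and the plain `sort` only approximates `_sort_formats`, which considers more keys than `preference`.

```python
import re


def qualities(quality_ids):
    # Simplified stand-in: rank a quality id by its position in the list,
    # and give unknown ids the lowest rank (-1).
    def q(qid):
        try:
            return quality_ids.index(qid)
        except ValueError:
            return -1
    return q


# Made-up markup for illustration only.
SAMPLE_PAGE = (
    '<a rel="webL" href="http://example.invalid/video_l.mp4"></a>'
    '<a rel="webS" href="http://example.invalid/video_s.mp4"></a>'
    '<a rel="webXL" href="http://example.invalid/video_xl.mp4"></a>'
)


def collect_direct_formats(webpage):
    preference = qualities(['S', 'M', 'L', 'XL'])
    formats = []
    # Same pattern as in the hunk: capture the size suffix and the href.
    for quality, video_url in re.findall(
            r'rel="web(S|M|L|XL)"[^>]+href="([^"]+)"', webpage):
        formats.append({
            'url': video_url,
            'preference': preference(quality),
            'http_headers': {'User-Agent': 'mobile'},
        })
    # Worst first, best last (approximation of the real sorting).
    formats.sort(key=lambda f: f['preference'])
    return formats


for f in collect_direct_formats(SAMPLE_PAGE):
    print(f['preference'], f['url'])
```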


@@ -5,7 +5,7 @@ from .youtube import YoutubeIE
class WimpIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?wimp\.com/(?P<id>[^/]+)/'
_VALID_URL = r'http://(?:www\.)?wimp\.com/(?P<id>[^/]+)'
_TESTS = [{
'url': 'http://www.wimp.com/maruexhausted/',
'md5': 'ee21217ffd66d058e8b16be340b74883',
@@ -28,18 +28,23 @@ class WimpIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_url = self._search_regex(
[r"[\"']file[\"']\s*[:,]\s*[\"'](.+?)[\"']", r"videoId\s*:\s*[\"']([^\"']+)[\"']"],
webpage, 'video URL')
if YoutubeIE.suitable(video_url):
self.to_screen('Found YouTube video')
youtube_id = self._search_regex(
r"videoId\s*:\s*[\"']([0-9A-Za-z_-]{11})[\"']",
webpage, 'video URL', default=None)
if youtube_id:
return {
'_type': 'url',
'url': video_url,
'url': youtube_id,
'ie_key': YoutubeIE.ie_key(),
}
video_url = self._search_regex(
r'<video[^>]+>\s*<source[^>]+src=(["\'])(?P<url>.+?)\1',
webpage, 'video URL', group='url')
return {
'id': video_id,
'url': video_url,


@@ -25,8 +25,8 @@ class YoukuIE(InfoExtractor):
'''
_TESTS = [{
# MD5 is unstable
'url': 'http://v.youku.com/v_show/id_XMTc1ODE5Njcy.html',
'md5': '5f3af4192eabacc4501508d54a8cabd7',
'info_dict': {
'id': 'XMTc1ODE5Njcy_part1',
'title': '★Smile﹗♡ Git Fresh -Booty Music舞蹈.',
@@ -42,6 +42,7 @@ class YoukuIE(InfoExtractor):
'title': '武媚娘传奇 85',
},
'playlist_count': 11,
'skip': 'Available in China only',
}, {
'url': 'http://v.youku.com/v_show/id_XMTI1OTczNDM5Mg==.html',
'info_dict': {
@@ -49,7 +50,6 @@ class YoukuIE(InfoExtractor):
'title': '花千骨 04',
},
'playlist_count': 13,
'skip': 'Available in China only',
}, {
'url': 'http://v.youku.com/v_show/id_XNjA1NzA2Njgw.html',
'note': 'Video protected with password',
@@ -63,7 +63,7 @@ class YoukuIE(InfoExtractor):
},
}]
def construct_video_urls(self, data1, data2):
def construct_video_urls(self, data):
# get sid, token
def yk_t(s1, s2):
ls = list(range(256))
@@ -81,34 +81,24 @@ class YoukuIE(InfoExtractor):
return bytes(s)
sid, token = yk_t(
b'becaf9be', base64.b64decode(data2['ep'].encode('ascii'))
b'becaf9be', base64.b64decode(data['security']['encrypt_string'].encode('ascii'))
).decode('ascii').split('_')
# get oip
oip = data2['ip']
# get fileid
string_ls = list(
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\:._-1234567890')
shuffled_string_ls = []
seed = data1['seed']
N = len(string_ls)
for ii in range(N):
seed = (seed * 0xd3 + 0x754f) % 0x10000
idx = seed * len(string_ls) // 0x10000
shuffled_string_ls.append(string_ls[idx])
del string_ls[idx]
oip = data['security']['ip']
fileid_dict = {}
for format in data1['streamtypes']:
streamfileid = [
int(i) for i in data1['streamfileids'][format].strip('*').split('*')]
fileid = ''.join(
[shuffled_string_ls[i] for i in streamfileid])
fileid_dict[format] = fileid[:8] + '%s' + fileid[10:]
for stream in data['stream']:
format = stream.get('stream_type')
fileid = stream['stream_fileid']
fileid_dict[format] = fileid
def get_fileid(format, n):
fileid = fileid_dict[format] % hex(int(n))[2:].upper().zfill(2)
number = hex(int(str(n), 10))[2:].upper()
if len(number) == 1:
number = '0' + number
streamfileids = fileid_dict[format]
fileid = streamfileids[0:8] + number + streamfileids[10:]
return fileid
# get ep
@@ -123,15 +113,15 @@ class YoukuIE(InfoExtractor):
# generate video_urls
video_urls_dict = {}
for format in data1['streamtypes']:
for stream in data['stream']:
format = stream.get('stream_type')
video_urls = []
for dt in data1['segs'][format]:
n = str(int(dt['no']))
for dt in stream['segs']:
n = str(stream['segs'].index(dt))
param = {
'K': dt['k'],
'K': dt['key'],
'hd': self.get_hd(format),
'myp': 0,
'ts': dt['seconds'],
'ypp': 0,
'ctype': 12,
'ev': 1,
@@ -142,7 +132,7 @@ class YoukuIE(InfoExtractor):
video_url = \
'http://k.youku.com/player/getFlvPath/' + \
'sid/' + sid + \
'_' + str(int(n) + 1).zfill(2) + \
'_00' + \
'/st/' + self.parse_ext_l(format) + \
'/fileid/' + get_fileid(format, n) + '?' + \
compat_urllib_parse.urlencode(param)
@@ -153,23 +143,31 @@ class YoukuIE(InfoExtractor):
def get_hd(self, fm):
hd_id_dict = {
'3gp': '0',
'3gphd': '1',
'flv': '0',
'flvhd': '0',
'mp4': '1',
'mp4hd': '1',
'mp4hd2': '1',
'mp4hd3': '1',
'hd2': '2',
'hd3': '3',
'3gp': '0',
'3gphd': '1'
}
return hd_id_dict[fm]
def parse_ext_l(self, fm):
ext_dict = {
'3gp': 'flv',
'3gphd': 'mp4',
'flv': 'flv',
'flvhd': 'flv',
'mp4': 'mp4',
'mp4hd': 'mp4',
'mp4hd2': 'flv',
'mp4hd3': 'flv',
'hd2': 'flv',
'hd3': 'flv',
'3gp': 'flv',
'3gphd': 'mp4'
}
return ext_dict[fm]
@@ -178,9 +176,13 @@ class YoukuIE(InfoExtractor):
'3gp': 'h6',
'3gphd': 'h5',
'flv': 'h4',
'flvhd': 'h4',
'mp4': 'h3',
'mp4hd': 'h3',
'mp4hd2': 'h4',
'mp4hd3': 'h4',
'hd2': 'h2',
'hd3': 'h1'
'hd3': 'h1',
}
return _dict[fm]
@@ -188,45 +190,46 @@ class YoukuIE(InfoExtractor):
video_id = self._match_id(url)
def retrieve_data(req_url, note):
req = sanitized_Request(req_url)
headers = {
'Referer': req_url,
}
self._set_cookie('youku.com', 'xreferrer', 'http://www.youku.com')
req = sanitized_Request(req_url, headers=headers)
cn_verification_proxy = self._downloader.params.get('cn_verification_proxy')
if cn_verification_proxy:
req.add_header('Ytdl-request-proxy', cn_verification_proxy)
raw_data = self._download_json(req, video_id, note=note)
return raw_data['data'][0]
return raw_data['data']
video_password = self._downloader.params.get('videopassword', None)
# request basic data
basic_data_url = 'http://v.youku.com/player/getPlayList/VideoIDS/%s' % video_id
basic_data_url = "http://play.youku.com/play/get.json?vid=%s&ct=12" % video_id
if video_password:
basic_data_url += '?password=%s' % video_password
basic_data_url += '&pwd=%s' % video_password
data1 = retrieve_data(
basic_data_url,
'Downloading JSON metadata 1')
data2 = retrieve_data(
'http://v.youku.com/player/getPlayList/VideoIDS/%s/Pf/4/ctype/12/ev/1' % video_id,
'Downloading JSON metadata 2')
data = retrieve_data(basic_data_url, 'Downloading JSON metadata')
error_code = data1.get('error_code')
if error_code:
error = data1.get('error')
if error is not None and '因版权原因无法观看此视频' in error:
error = data.get('error')
if error:
error_note = error.get('note')
if error_note is not None and '因版权原因无法观看此视频' in error_note:
raise ExtractorError(
'Youku said: Sorry, this video is available in China only', expected=True)
else:
msg = 'Youku server reported error %i' % error_code
msg = 'Youku server reported error %i' % error.get('code')
if error is not None:
msg += ': ' + error
msg += ': ' + error_note
raise ExtractorError(msg)
title = data1['title']
# get video title
title = data['video']['title']
# generate video_urls_dict
video_urls_dict = self.construct_video_urls(data1, data2)
video_urls_dict = self.construct_video_urls(data)
# construct info
entries = [{
@@ -235,10 +238,11 @@ class YoukuIE(InfoExtractor):
'formats': [],
# some formats are not available for all parts, we have to detect
# which one has all
} for i in range(max(len(v) for v in data1['segs'].values()))]
for fm in data1['streamtypes']:
} for i in range(max(len(v.get('segs')) for v in data['stream']))]
for stream in data['stream']:
fm = stream.get('stream_type')
video_urls = video_urls_dict[fm]
for video_url, seg, entry in zip(video_urls, data1['segs'][fm], entries):
for video_url, seg, entry in zip(video_urls, stream['segs'], entries):
entry['formats'].append({
'url': video_url,
'format_id': self.get_format_name(fm),
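Review note on the Youku `get_fileid` change above: the new code splices a two-digit uppercase hex segment number into positions 8 and 9 of `stream_fileid`. A standalone sketch of that splice, pulled out of the closure so it can be run on its own; the sample file id is invented, and `get_fileid_compact` is a hypothetical shorter variant shown only to make the intent explicit.

```python
def get_fileid(stream_fileid, n):
    # Same steps as the hunk: segment index -> uppercase hex, padded to
    # two characters, replacing characters 8 and 9 of the stream file id.
    number = hex(int(str(n), 10))[2:].upper()
    if len(number) == 1:
        number = '0' + number
    return stream_fileid[0:8] + number + stream_fileid[10:]


def get_fileid_compact(stream_fileid, n):
    # Hypothetical equivalent for segment numbers 0..255 using '%02X'.
    return stream_fileid[:8] + '%02X' % int(n) + stream_fileid[10:]


SAMPLE_FILEID = '03000801XXABCDEF'  # invented value, not a real Youku id

for segment in (0, 11, 255):
    print(segment, get_fileid(SAMPLE_FILEID, segment))
```

Both helpers agree for segment numbers below 256, which is the range where the two-character slot is sufficient.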


@@ -258,7 +258,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|(?: # or the v= param in all its forms
(?:(?:watch|movie)(?:_popup)?(?:\.php)?/?)? # preceding watch(_popup|.php) or nothing (like /?v=xxxx)
(?:\?|\#!?) # the params delimiter ? or # or #!
(?:.*?&)?? # any other preceding param (like /?s=tuff&v=xxxx)
(?:.*?[&;])?? # any other preceding param (like /?s=tuff&v=xxxx or ?s=tuff&amp;v=V36LpHqtcDY)
v=
)
))
@@ -346,6 +346,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'247': {'ext': 'webm', 'height': 720, 'format_note': 'DASH video', 'acodec': 'none', 'preference': -40},
'248': {'ext': 'webm', 'height': 1080, 'format_note': 'DASH video', 'acodec': 'none', 'preference': -40},
'271': {'ext': 'webm', 'height': 1440, 'format_note': 'DASH video', 'acodec': 'none', 'preference': -40},
# itag 272 videos are either 3840x2160 (e.g. RtoitU2A-3E) or 7680x4320 (sLprVF6d7Ug)
'272': {'ext': 'webm', 'height': 2160, 'format_note': 'DASH video', 'acodec': 'none', 'preference': -40},
'302': {'ext': 'webm', 'height': 720, 'format_note': 'DASH video', 'acodec': 'none', 'preference': -40, 'fps': 60, 'vcodec': 'vp9'},
'303': {'ext': 'webm', 'height': 1080, 'format_note': 'DASH video', 'acodec': 'none', 'preference': -40, 'fps': 60, 'vcodec': 'vp9'},
@@ -714,6 +715,26 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'url': 'https://www.youtube.com/watch?v=Ms7iBXnlUO8',
'only_matching': True,
},
{
# Video with yt:stretch=17:0
'url': 'https://www.youtube.com/watch?v=Q39EVAstoRM',
'info_dict': {
'id': 'Q39EVAstoRM',
'ext': 'mp4',
'title': 'Clash Of Clans#14 Dicas De Ataque Para CV 4',
'description': 'md5:ee18a25c350637c8faff806845bddee9',
'upload_date': '20151107',
'uploader_id': 'UCCr7TALkRbo3EtFzETQF1LA',
'uploader': 'CH GAMER DROID',
},
'params': {
'skip_download': True,
},
},
{
'url': 'https://www.youtube.com/watch?feature=player_embedded&amp;amp;v=V36LpHqtcDY',
'only_matching': True,
}
]
def __init__(self, *args, **kwargs):
@@ -1459,6 +1480,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
manifest_url = video_info['hlsvp'][0]
url_map = self._extract_from_m3u8(manifest_url, video_id)
formats = _map_to_format_list(url_map)
# Accept-Encoding header causes failures in live streams on Youtube and Youtube Gaming
for a_format in formats:
a_format.setdefault('http_headers', {})['Youtubedl-no-compression'] = 'True'
else:
raise ExtractorError('no conn, hlsvp or url_encoded_fmt_stream_map information found in video info')
@@ -1496,10 +1520,15 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
r'<meta\s+property="og:video:tag".*?content="yt:stretch=(?P<w>[0-9]+):(?P<h>[0-9]+)">',
video_webpage)
if stretched_m:
ratio = float(stretched_m.group('w')) / float(stretched_m.group('h'))
for f in formats:
if f.get('vcodec') != 'none':
f['stretched_ratio'] = ratio
w = float(stretched_m.group('w'))
h = float(stretched_m.group('h'))
# yt:stretch may hold invalid ratio data (e.g. for Q39EVAstoRM ratio is 17:0).
# We will only process correct ratios.
if w > 0 and h > 0:
ratio = w / h
for f in formats:
if f.get('vcodec') != 'none':
f['stretched_ratio'] = ratio
self._sort_formats(formats)
@@ -1538,7 +1567,7 @@ class YoutubePlaylistIE(YoutubeBaseInfoExtractor, YoutubePlaylistBaseInfoExtract
youtube\.com/
(?:
(?:course|view_play_list|my_playlists|artist|playlist|watch|embed/videoseries)
\? (?:.*?&)*? (?:p|a|list)=
\? (?:.*?[&;])*? (?:p|a|list)=
| p/
)
(
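Review note on the `yt:stretch` hunk above: the old code divided `w` by `h` directly, which raises `ZeroDivisionError` for tags such as `yt:stretch=17:0`. A minimal sketch of the guarded computation, using the same regular expression as the extractor; the function name `stretched_ratio` and the sample markup are assumptions made for this note.

```python
import re

STRETCH_RE = (
    r'<meta\s+property="og:video:tag".*?'
    r'content="yt:stretch=(?P<w>[0-9]+):(?P<h>[0-9]+)">')


def stretched_ratio(video_webpage):
    # Hypothetical helper: return the ratio, or None when the tag is
    # missing or carries an invalid value such as 17:0.
    stretched_m = re.search(STRETCH_RE, video_webpage)
    if not stretched_m:
        return None
    w = float(stretched_m.group('w'))
    h = float(stretched_m.group('h'))
    if w > 0 and h > 0:
        return w / h
    return None


print(stretched_ratio('<meta property="og:video:tag" content="yt:stretch=16:9">'))
print(stretched_ratio('<meta property="og:video:tag" content="yt:stretch=17:0">'))
```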


@@ -214,7 +214,7 @@ class JSInterpreter(object):
obj = {}
obj_m = re.search(
(r'(?:var\s+)?%s\s*=\s*\{' % re.escape(objname)) +
r'\s*(?P<fields>([a-zA-Z$0-9]+\s*:\s*function\(.*?\)\s*\{.*?\})*)' +
r'\s*(?P<fields>([a-zA-Z$0-9]+\s*:\s*function\(.*?\)\s*\{.*?\}(?:,\s*)?)*)' +
r'\}\s*;',
self.code)
fields = obj_m.group('fields')
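Review note on the `JSInterpreter` hunk above: the added `(?:,\s*)?` lets the field list continue across a comma followed by whitespace or a newline. The sketch below compares the old and new patterns on an invented two-method object whose fields are separated by `,\n`. `find_fields` and `SAMPLE_JS` are hypothetical names used only here, and the JavaScript snippet is made up rather than taken from a real player.

```python
import re

OLD_FIELDS = r'\s*(?P<fields>([a-zA-Z$0-9]+\s*:\s*function\(.*?\)\s*\{.*?\})*)'
NEW_FIELDS = r'\s*(?P<fields>([a-zA-Z$0-9]+\s*:\s*function\(.*?\)\s*\{.*?\}(?:,\s*)?)*)'

# Invented sample: two methods separated by a comma and a newline.
SAMPLE_JS = (
    'var Ab={swap:function(a,b){var c=a[0];a[0]=a[b%a.length];a[b]=c},\n'
    'rev:function(a){a.reverse()}};'
)


def find_fields(code, objname, fields_pattern):
    # Hypothetical helper: assemble the pattern the same way the hunk does
    # and return the raw field list, or None when the object is not found.
    obj_m = re.search(
        (r'(?:var\s+)?%s\s*=\s*\{' % re.escape(objname)) +
        fields_pattern +
        r'\}\s*;',
        code)
    return obj_m.group('fields') if obj_m else None


print(find_fields(SAMPLE_JS, 'Ab', OLD_FIELDS))
new_fields = find_fields(SAMPLE_JS, 'Ab', NEW_FIELDS)
print(re.findall(r'([a-zA-Z$0-9]+)\s*:\s*function', new_fields))
```

Because `.` does not match a newline without `re.DOTALL`, the old pattern has no way to step over the `,\n` separator in this sample, while the new one consumes it explicitly.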


@@ -338,7 +338,7 @@ def parseOpts(overrideArguments=None):
video_format.add_option(
'-F', '--list-formats',
action='store_true', dest='listformats',
help='List all available formats')
help='List all available formats of requested videos')
video_format.add_option(
'--youtube-include-dash-manifest',
action='store_true', dest='youtube_include_dash_manifest', default=True,


@@ -52,7 +52,7 @@ class FFmpegPostProcessor(PostProcessor):
def _determine_executables(self):
programs = ['avprobe', 'avconv', 'ffmpeg', 'ffprobe']
prefer_ffmpeg = self._downloader.params.get('prefer_ffmpeg', False)
prefer_ffmpeg = False
self.basename = None
self.probe_basename = None
@@ -60,6 +60,7 @@ class FFmpegPostProcessor(PostProcessor):
self._paths = None
self._versions = None
if self._downloader:
prefer_ffmpeg = self._downloader.params.get('prefer_ffmpeg', False)
location = self._downloader.params.get('ffmpeg_location')
if location is not None:
if not os.path.exists(location):


@@ -663,6 +663,16 @@ def _create_http_connection(ydl_handler, http_class, is_https, *args, **kwargs):
return hc
def handle_youtubedl_headers(headers):
filtered_headers = headers
if 'Youtubedl-no-compression' in filtered_headers:
filtered_headers = dict((k, v) for k, v in filtered_headers.items() if k.lower() != 'accept-encoding')
del filtered_headers['Youtubedl-no-compression']
return filtered_headers
class YoutubeDLHandler(compat_urllib_request.HTTPHandler):
"""Handler for HTTP requests and responses.
@@ -670,7 +680,7 @@ class YoutubeDLHandler(compat_urllib_request.HTTPHandler):
the standard headers to every HTTP request and handles gzipped and
deflated responses from web servers. If compression is to be avoided in
a particular request, the original request in the program code only has
to include the HTTP header "Youtubedl-No-Compression", which will be
to include the HTTP header "Youtubedl-no-compression", which will be
removed before making the real request.
Part of this code was copied from:
@@ -731,10 +741,8 @@ class YoutubeDLHandler(compat_urllib_request.HTTPHandler):
# The dict keys are capitalized because of this bug by urllib
if h.capitalize() not in req.headers:
req.add_header(h, v)
if 'Youtubedl-no-compression' in req.headers:
if 'Accept-encoding' in req.headers:
del req.headers['Accept-encoding']
del req.headers['Youtubedl-no-compression']
req.headers = handle_youtubedl_headers(req.headers)
if sys.version_info < (2, 7) and '#' in req.get_full_url():
# Python 2.6 is brain-dead when it comes to fragments
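Review note on `handle_youtubedl_headers` above: the function is reproduced below with a small demonstration of what it does to a header dict. The sample header values are invented. One behaviour worth noting is that the input dict is left untouched when the marker is present, because a filtered copy is built before the marker is deleted.

```python
def handle_youtubedl_headers(headers):
    # Copied from the hunk: drop Accept-Encoding (any capitalisation) and
    # the internal marker header when the marker is present.
    filtered_headers = headers

    if 'Youtubedl-no-compression' in filtered_headers:
        filtered_headers = dict(
            (k, v) for k, v in filtered_headers.items()
            if k.lower() != 'accept-encoding')
        del filtered_headers['Youtubedl-no-compression']

    return filtered_headers


# Invented sample headers for illustration.
request_headers = {
    'User-agent': 'Mozilla/5.0',
    'Accept-encoding': 'gzip, deflate',
    'Youtubedl-no-compression': 'True',
}
plain_headers = {'User-agent': 'Mozilla/5.0', 'Accept-encoding': 'gzip, deflate'}

print(handle_youtubedl_headers(request_headers))
print(handle_youtubedl_headers(plain_headers) is plain_headers)
```

Without the marker, the same dict object is returned unchanged, so callers should not assume they always receive a copy.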


@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2015.11.23'
__version__ = '2015.12.13'