mirror of https://github.com/ytdl-org/youtube-dl.git
synced 2025-12-12 09:02:44 +01:00

Comparing 89 commits: 2016.10.16 ... 2016.11.02
Commit SHA1s (the author and date columns were not captured in this mirror):

8956d6608a, 3365ea8929, a18aeee803, 1616f9b452, 02dc0a36b7, 639e3b5c99,
b2758123c5, f449c061d0, 9c82bba05d, e3577722b0, b82c33dd67, e5a088dc4b,
2c6da7df4a, 7e7a028aa4, e70a5e6566, 3bf55be466, a901fc5fc2, cae6bc0118,
d9ee2e5cf6, e1a0b3b81c, 2a048f9878, ea331f40e6, f02700a1fa, f3517569f6,
c725333d41, a5a8877f9c, 43c53a1700, ec8705117a, 3d8d44c7b1, 88839f4380,
83e9374464, 773017c648, 777d90dc28, 3791d84acc, 9305a0dc60, 94e08950e3,
ee824a8d06, d3b6b3b95b, b17422753f, b0b28b8241, 81cb7a5978, d2e96a8ed4,
2e7c8cab55, d7d4481c6a, 5ace137bf4, 9dde0e04e6, f16f8505b1, 9dc13a6780,
9aa929d337, 425f3fdfcb, e034cbc581, 5378f8ce0d, b64d04c119, 00ca755231,
69c2d42bd7, 062e2769a3, 859447a28d, f8ae2c7f30, 9ce0077485, 0ebb86bd18,
9df6b03caf, 8e2915d70b, 19e447150d, ad9fd84004, 60633ae9a0, a81dc82151,
9218a6b4f5, 02af6ec707, 05b7996cab, 46f6052950, c8802041dd, c7911009a0,
2b96b06bf0, 06b3fe2926, 2c6743bf0f, efb6242916, 0384932e3d, edd6074cea,
791d29dbf8, 481cc7335c, 853a71b628, e2628fb6a0, df4939b1cd, 0b94dbb115,
8d76bdf12b, 8204bacf1d, 47da782337, 74324a7ac2, b0dfcab60a
.github/ISSUE_TEMPLATE.md (vendored): 6 lines changed
@@ -6,8 +6,8 @@
 ---
-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.10.16*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
-- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.10.16**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.11.02*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.11.02**

 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2016.10.16
+[debug] youtube-dl version 2016.11.02
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
AUTHORS: 4 lines changed
@@ -186,3 +186,7 @@ Sebastian Blunt
 Matěj Cepl
 Xie Yanbo
 Philip Xu
 John Hawkinson
+Rich Leeper
+Zhong Jianxin
+Thor77
@@ -245,7 +245,7 @@ Say `meta` from the previous example has a `title` and you are about to extract
 title = meta['title']
 ```

-If `title` disappeares from `meta` in future due to some changes on the hoster's side the extraction would fail since `title` is mandatory. That's expected.
+If `title` disappears from `meta` in future due to some changes on the hoster's side the extraction would fail since `title` is mandatory. That's expected.

 Assume that you have some another source you can extract `title` from, for example `og:title` HTML meta of a `webpage`. In this case you can provide a fallback scenario:
ChangeLog: 89 lines changed
@@ -1,3 +1,92 @@
version 2016.11.02

Core
+ Add basic support for Smooth Streaming protocol (#8118, #10969)
* Improve MPD manifest base URL extraction (#10909, #11079)
* Fix --match-filter for int-like strings (#11082)

Extractors
+ [mva] Add support for ISM formats
+ [msn] Add support for ISM formats
+ [onet] Add support for ISM formats
+ [tvp] Add support for ISM formats
+ [nicknight] Add support for nicknight sites (#10769)


version 2016.10.30

Extractors
* [facebook] Improve 1080P video detection (#11073)
* [imgur] Recognize /r/ URLs (#11071)
* [beeg] Fix extraction (#11069)
* [openload] Fix extraction (#10408)
* [gvsearch] Modernize and fix search request (#11051)
* [adultswim] Fix extraction (#10979)
+ [nobelprize] Add support for nobelprize.org (#9999)
* [hornbunny] Fix extraction (#10981)
* [tvp] Improve video id extraction (#10585)


version 2016.10.26

Extractors
+ [rentv] Add support for ren.tv (#10620)
+ [ard] Detect unavailable videos (#11018)
* [vk] Fix extraction (#11022)


version 2016.10.25

Core
* Running youtube-dl in the background is fixed (#10996, #10706, #955)

Extractors
+ [jamendo] Add support for jamendo.com (#10132, #10736)
+ [pandatv] Add support for panda.tv (#10736)
+ [dotsub] Support Vimeo embed (#10964)
* [litv] Fix extraction
+ [vimeo] Delegate ondemand redirects to ondemand extractor (#10994)
* [vivo] Fix extraction (#11003)
+ [twitch:stream] Add support for rebroadcasts (#10995)
* [pluralsight] Fix subtitles conversion (#10990)


version 2016.10.21.1

Extractors
+ [pluralsight] Process all clip URLs (#10984)


version 2016.10.21

Core
- Disable thumbnails embedding in mkv
+ Add support for Comcast multiple-system operator (#10819)

Extractors
* [pluralsight] Adapt to new API (#10972)
* [openload] Fix extraction (#10408, #10971)
+ [natgeo] Extract m3u8 formats (#10959)


version 2016.10.19

Core
+ [utils] Expose PACKED_CODES_RE
+ [extractor/common] Extract non smil wowza mpd manifests
+ [extractor/common] Detect f4m audio-only formats

Extractors
* [vidzi] Fix extraction (#10908, #10952)
* [urplay] Fix subtitles extraction
+ [urplay] Add support for urskola.se (#10915)
+ [orf] Add subtitles support (#10939)
* [youtube] Fix --no-playlist behavior for youtu.be/id URLs (#10896)
* [nrk] Relax URL regular expression (#10928)
+ [nytimes] Add support for podcasts (#10926)
* [pluralsight] Relax URL regular expression (#10941)


version 2016.10.16

Core
@@ -728,7 +728,7 @@ Add a file exclusion for `youtube-dl.exe` in Windows Defender settings.

 YouTube changed their playlist format in March 2014 and later on, so you'll need at least youtube-dl 2014.07.25 to download all YouTube videos.

-If you have installed youtube-dl with a package manager, pip, setup.py or a tarball, please use that to update. Note that Ubuntu packages do not seem to get updated anymore. Since we are not affiliated with Ubuntu, there is little we can do. Feel free to [report bugs](https://bugs.launchpad.net/ubuntu/+source/youtube-dl/+filebug) to the [Ubuntu packaging guys](mailto:ubuntu-motu@lists.ubuntu.com?subject=outdated%20version%20of%20youtube-dl) - all they have to do is update the package to a somewhat recent version. See above for a way to update.
+If you have installed youtube-dl with a package manager, pip, setup.py or a tarball, please use that to update. Note that Ubuntu packages do not seem to get updated anymore. Since we are not affiliated with Ubuntu, there is little we can do. Feel free to [report bugs](https://bugs.launchpad.net/ubuntu/+source/youtube-dl/+filebug) to the [Ubuntu packaging people](mailto:ubuntu-motu@lists.ubuntu.com?subject=outdated%20version%20of%20youtube-dl) - all they have to do is update the package to a somewhat recent version. See above for a way to update.

 ### I'm getting an error when trying to use output template: `error: using output template conflicts with using title, video ID or auto number`

@@ -1083,7 +1083,7 @@ Say `meta` from the previous example has a `title` and you are about to extract
 title = meta['title']
 ```

-If `title` disappeares from `meta` in future due to some changes on the hoster's side the extraction would fail since `title` is mandatory. That's expected.
+If `title` disappears from `meta` in future due to some changes on the hoster's side the extraction would fail since `title` is mandatory. That's expected.

 Assume that you have some another source you can extract `title` from, for example `og:title` HTML meta of a `webpage`. In this case you can provide a fallback scenario:
@@ -332,6 +332,8 @@
 - **ivideon**: Ivideon TV
 - **Iwara**
 - **Izlesene**
+- **Jamendo**
+- **JamendoAlbum**
 - **JeuxVideo**
 - **Jove**
 - **jpopsuki.tv**
@@ -481,11 +483,13 @@
 - **nhl.com:videocenter:category**: NHL videocenter category
 - **nick.com**
 - **nick.de**
+- **nicknight**
 - **niconico**: ニコニコ動画
 - **NiconicoPlaylist**
 - **Nintendo**
 - **njoy**: N-JOY
 - **njoy:embed**
+- **NobelPrize**
 - **Noco**
 - **Normalboots**
 - **NosVideo**
@@ -527,6 +531,7 @@
 - **orf:iptv**: iptv.ORF.at
 - **orf:oe1**: Radio Österreich 1
 - **orf:tvthek**: ORF TVthek
+- **PandaTV**: 熊猫TV
 - **pandora.tv**: 판도라TV
 - **parliamentlive.tv**: UK parliament videos
 - **Patreon**
@@ -586,6 +591,8 @@
 - **RDS**: RDS.ca
 - **RedTube**
 - **RegioTV**
+- **RENTV**
+- **RENTVArticle**
 - **Restudy**
 - **Reuters**
 - **ReverbNation**
@@ -641,7 +648,7 @@
 - **ServingSys**
 - **Sexu**
 - **Shahid**
-- **Shared**: shared.sx and vivo.sx
+- **Shared**: shared.sx
 - **ShareSix**
 - **Sina**
 - **SixPlay**
@@ -848,6 +855,7 @@
 - **Vimple**: Vimple - one-click video hosting
 - **Vine**
 - **vine:user**
+- **Vivo**: vivo.sx
 - **vk**: VK
 - **vk:uservideos**: VK - User's Videos
 - **vk:wallpost**
@@ -605,6 +605,7 @@ class TestYoutubeDL(unittest.TestCase):
             'extractor': 'TEST',
             'duration': 30,
             'filesize': 10 * 1024,
+            'playlist_id': '42',
         }
         second = {
             'id': '2',
@@ -614,6 +615,7 @@ class TestYoutubeDL(unittest.TestCase):
             'duration': 10,
             'description': 'foo',
             'filesize': 5 * 1024,
+            'playlist_id': '43',
         }
         videos = [first, second]

@@ -650,6 +652,10 @@ class TestYoutubeDL(unittest.TestCase):
         res = get_videos(f)
         self.assertEqual(res, ['1'])

+        f = match_filter_func('playlist_id = 42')
+        res = get_videos(f)
+        self.assertEqual(res, ['1'])
+
     def test_playlist_items_selection(self):
         entries = [{
             'id': compat_str(i),
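The new test above exercises the `--match-filter` fix for int-like strings: `playlist_id` is stored as the string `'42'`, yet the filter `playlist_id = 42` must still match it. A minimal sketch of that string-comparison behavior (`match_str_filter` is a hypothetical helper; the real `match_filter_func` in `youtube_dl/utils.py` supports a much richer comparison grammar):

```python
def match_str_filter(filter_str):
    # Compare field values as strings so an int-like value such as
    # playlist_id '42' still matches the filter 'playlist_id = 42'.
    field, _, value = [part.strip() for part in filter_str.partition('=')]

    def _match(info_dict):
        actual = info_dict.get(field)
        return actual is not None and str(actual) == value

    return _match

f = match_str_filter('playlist_id = 42')
```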
@@ -69,6 +69,7 @@ from youtube_dl.utils import (
     uppercase_escape,
     lowercase_escape,
     url_basename,
+    base_url,
     urlencode_postdata,
     urshift,
     update_url_query,
@@ -437,6 +438,13 @@ class TestUtil(unittest.TestCase):
             url_basename('http://media.w3.org/2010/05/sintel/trailer.mp4'),
             'trailer.mp4')

+    def test_base_url(self):
+        self.assertEqual(base_url('http://foo.de/'), 'http://foo.de/')
+        self.assertEqual(base_url('http://foo.de/bar'), 'http://foo.de/')
+        self.assertEqual(base_url('http://foo.de/bar/'), 'http://foo.de/bar/')
+        self.assertEqual(base_url('http://foo.de/bar/baz'), 'http://foo.de/bar/')
+        self.assertEqual(base_url('http://foo.de/bar/baz?x=z/x/c'), 'http://foo.de/bar/')
+
     def test_parse_age_limit(self):
         self.assertEqual(parse_age_limit(None), None)
         self.assertEqual(parse_age_limit(False), None)
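The `base_url` helper added for MPD manifest base URL extraction can be sketched from the tests above: keep everything up to and including the last `/` of the path, dropping any query string. This is an illustrative reimplementation, not necessarily the exact code in `youtube_dl/utils.py`:

```python
import re


def base_url(url):
    # Greedy match up to the last '/' before any query string or fragment.
    return re.match(r'https?://[^?#]+/', url).group()
```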
@@ -1658,7 +1658,7 @@ class YoutubeDL(object):
                 video_ext, audio_ext = audio.get('ext'), video.get('ext')
                 if video_ext and audio_ext:
                     COMPATIBLE_EXTS = (
-                        ('mp3', 'mp4', 'm4a', 'm4p', 'm4b', 'm4r', 'm4v'),
+                        ('mp3', 'mp4', 'm4a', 'm4p', 'm4b', 'm4r', 'm4v', 'ismv', 'isma'),
                         ('webm')
                     )
                     for exts in COMPATIBLE_EXTS:
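This change lets the ISM video (`ismv`) and audio (`isma`) streams be merged without re-encoding. A self-contained sketch of the compatibility check (`exts_compatible` is a hypothetical name; note the patched code writes `('webm')`, a plain string rather than a one-element tuple, which still works only because `'webm' in 'webm'` is `True`):

```python
def exts_compatible(video_ext, audio_ext):
    # Two extensions can be merged losslessly if both belong to the
    # same container family.
    COMPATIBLE_EXTS = (
        ('mp3', 'mp4', 'm4a', 'm4p', 'm4b', 'm4r', 'm4v', 'ismv', 'isma'),
        ('webm',),  # one-element tuple, clearer than the original ('webm')
    )
    for exts in COMPATIBLE_EXTS:
        if video_ext in exts and audio_ext in exts:
            return True
    return False
```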
@@ -7,6 +7,7 @@ from .http import HttpFD
 from .rtmp import RtmpFD
 from .dash import DashSegmentsFD
 from .rtsp import RtspFD
+from .ism import IsmFD
 from .external import (
     get_external_downloader,
     FFmpegFD,
@@ -24,6 +25,7 @@ PROTOCOL_MAP = {
     'rtsp': RtspFD,
     'f4m': F4mFD,
     'http_dash_segments': DashSegmentsFD,
+    'ism': IsmFD,
 }
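Registering `IsmFD` under the `'ism'` key is all the dispatch this downloader needs: formats whose `protocol` is `'ism'` are routed to it, everything unknown falls through to the plain HTTP downloader. A toy sketch of that lookup (stub classes, illustrative only):

```python
class HttpFD(object):
    """Stub standing in for the plain HTTP downloader."""


class IsmFD(object):
    """Stub standing in for the Smooth Streaming downloader."""


PROTOCOL_MAP = {
    'ism': IsmFD,
}


def get_suitable_downloader(protocol):
    # Protocols without a dedicated downloader fall back to HttpFD.
    return PROTOCOL_MAP.get(protocol, HttpFD)
```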
youtube_dl/downloader/ism.py: 273 lines (new file)
@@ -0,0 +1,273 @@
from __future__ import unicode_literals

import os
import time
import struct
import binascii
import io

from .fragment import FragmentFD
from ..compat import compat_urllib_error
from ..utils import (
    sanitize_open,
    encodeFilename,
)


u8 = struct.Struct(b'>B')
u88 = struct.Struct(b'>Bx')
u16 = struct.Struct(b'>H')
u1616 = struct.Struct(b'>Hxx')
u32 = struct.Struct(b'>I')
u64 = struct.Struct(b'>Q')

s88 = struct.Struct(b'>bx')
s16 = struct.Struct(b'>h')
s1616 = struct.Struct(b'>hxx')
s32 = struct.Struct(b'>i')

unity_matrix = (s32.pack(0x10000) + s32.pack(0) * 3) * 2 + s32.pack(0x40000000)

TRACK_ENABLED = 0x1
TRACK_IN_MOVIE = 0x2
TRACK_IN_PREVIEW = 0x4

SELF_CONTAINED = 0x1


def box(box_type, payload):
    return u32.pack(8 + len(payload)) + box_type + payload


def full_box(box_type, version, flags, payload):
    return box(box_type, u8.pack(version) + u32.pack(flags)[1:] + payload)
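The two helpers above are the whole ISO BMFF serialization layer: a box is a big-endian 32-bit size (header plus payload), a four-byte type, then the payload; a full box prepends a version byte and 24-bit flags to the payload. A quick check that the framing is what the spec requires, using the same `ftyp` payload `write_piff_header` emits below:

```python
import struct

u8 = struct.Struct(b'>B')
u32 = struct.Struct(b'>I')


def box(box_type, payload):
    # size (4 bytes, includes the 8-byte header) + type + payload
    return u32.pack(8 + len(payload)) + box_type + payload


def full_box(box_type, version, flags, payload):
    # version (1 byte) + flags (low 3 bytes of a packed u32) + payload
    return box(box_type, u8.pack(version) + u32.pack(flags)[1:] + payload)


# File Type Box: 16-byte payload + 8-byte header = 24 bytes total
ftyp = box(b'ftyp', b'isml' + u32.pack(1) + b'piff' + b'iso2')
```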
def write_piff_header(stream, params):
    track_id = params['track_id']
    fourcc = params['fourcc']
    duration = params['duration']
    timescale = params.get('timescale', 10000000)
    language = params.get('language', 'und')
    height = params.get('height', 0)
    width = params.get('width', 0)
    is_audio = width == 0 and height == 0
    creation_time = modification_time = int(time.time())

    ftyp_payload = b'isml'  # major brand
    ftyp_payload += u32.pack(1)  # minor version
    ftyp_payload += b'piff' + b'iso2'  # compatible brands
    stream.write(box(b'ftyp', ftyp_payload))  # File Type Box

    mvhd_payload = u64.pack(creation_time)
    mvhd_payload += u64.pack(modification_time)
    mvhd_payload += u32.pack(timescale)
    mvhd_payload += u64.pack(duration)
    mvhd_payload += s1616.pack(1)  # rate
    mvhd_payload += s88.pack(1)  # volume
    mvhd_payload += u16.pack(0)  # reserved
    mvhd_payload += u32.pack(0) * 2  # reserved
    mvhd_payload += unity_matrix
    mvhd_payload += u32.pack(0) * 6  # pre defined
    mvhd_payload += u32.pack(0xffffffff)  # next track id
    moov_payload = full_box(b'mvhd', 1, 0, mvhd_payload)  # Movie Header Box

    tkhd_payload = u64.pack(creation_time)
    tkhd_payload += u64.pack(modification_time)
    tkhd_payload += u32.pack(track_id)  # track id
    tkhd_payload += u32.pack(0)  # reserved
    tkhd_payload += u64.pack(duration)
    tkhd_payload += u32.pack(0) * 2  # reserved
    tkhd_payload += s16.pack(0)  # layer
    tkhd_payload += s16.pack(0)  # alternate group
    tkhd_payload += s88.pack(1 if is_audio else 0)  # volume
    tkhd_payload += u16.pack(0)  # reserved
    tkhd_payload += unity_matrix
    tkhd_payload += u1616.pack(width)
    tkhd_payload += u1616.pack(height)
    trak_payload = full_box(b'tkhd', 1, TRACK_ENABLED | TRACK_IN_MOVIE | TRACK_IN_PREVIEW, tkhd_payload)  # Track Header Box

    mdhd_payload = u64.pack(creation_time)
    mdhd_payload += u64.pack(modification_time)
    mdhd_payload += u32.pack(timescale)
    mdhd_payload += u64.pack(duration)
    mdhd_payload += u16.pack(((ord(language[0]) - 0x60) << 10) | ((ord(language[1]) - 0x60) << 5) | (ord(language[2]) - 0x60))
    mdhd_payload += u16.pack(0)  # pre defined
    mdia_payload = full_box(b'mdhd', 1, 0, mdhd_payload)  # Media Header Box

    hdlr_payload = u32.pack(0)  # pre defined
    hdlr_payload += b'soun' if is_audio else b'vide'  # handler type
    hdlr_payload += u32.pack(0) * 3  # reserved
    hdlr_payload += (b'Sound' if is_audio else b'Video') + b'Handler\0'  # name
    mdia_payload += full_box(b'hdlr', 0, 0, hdlr_payload)  # Handler Reference Box

    if is_audio:
        smhd_payload = s88.pack(0)  # balance
        smhd_payload += u16.pack(0)  # reserved (the original assigned with `=` here, discarding the balance byte)
        media_header_box = full_box(b'smhd', 0, 0, smhd_payload)  # Sound Media Header
    else:
        vmhd_payload = u16.pack(0)  # graphics mode
        vmhd_payload += u16.pack(0) * 3  # opcolor
        media_header_box = full_box(b'vmhd', 0, 1, vmhd_payload)  # Video Media Header
    minf_payload = media_header_box

    dref_payload = u32.pack(1)  # entry count
    dref_payload += full_box(b'url ', 0, SELF_CONTAINED, b'')  # Data Entry URL Box
    dinf_payload = full_box(b'dref', 0, 0, dref_payload)  # Data Reference Box
    minf_payload += box(b'dinf', dinf_payload)  # Data Information Box

    stsd_payload = u32.pack(1)  # entry count

    sample_entry_payload = u8.pack(0) * 6  # reserved
    sample_entry_payload += u16.pack(1)  # data reference index
    if is_audio:
        sample_entry_payload += u32.pack(0) * 2  # reserved
        sample_entry_payload += u16.pack(params.get('channels', 2))
        sample_entry_payload += u16.pack(params.get('bits_per_sample', 16))
        sample_entry_payload += u16.pack(0)  # pre defined
        sample_entry_payload += u16.pack(0)  # reserved
        sample_entry_payload += u1616.pack(params['sampling_rate'])

        if fourcc == 'AACL':
            sample_entry_box = box(b'mp4a', sample_entry_payload)
    else:
        sample_entry_payload += u16.pack(0)  # pre defined
        sample_entry_payload += u16.pack(0)  # reserved
        sample_entry_payload += u32.pack(0) * 3  # pre defined
        sample_entry_payload += u16.pack(width)
        sample_entry_payload += u16.pack(height)
        sample_entry_payload += u1616.pack(0x48)  # horiz resolution 72 dpi
        sample_entry_payload += u1616.pack(0x48)  # vert resolution 72 dpi
        sample_entry_payload += u32.pack(0)  # reserved
        sample_entry_payload += u16.pack(1)  # frame count
        sample_entry_payload += u8.pack(0) * 32  # compressor name
        sample_entry_payload += u16.pack(0x18)  # depth
        sample_entry_payload += s16.pack(-1)  # pre defined

        codec_private_data = binascii.unhexlify(params['codec_private_data'])
        if fourcc in ('H264', 'AVC1'):
            sps, pps = codec_private_data.split(u32.pack(1))[1:]
            avcc_payload = u8.pack(1)  # configuration version
            avcc_payload += sps[1]  # avc profile indication
            avcc_payload += sps[2]  # profile compatibility
            avcc_payload += sps[3]  # avc level indication
            avcc_payload += u8.pack(0xfc | (params.get('nal_unit_length_field', 4) - 1))  # complete representation (1) + reserved (11111) + length size minus one
            avcc_payload += u8.pack(1)  # reserved (0) + number of sps (0000001)
            avcc_payload += u16.pack(len(sps))
            avcc_payload += sps
            avcc_payload += u8.pack(1)  # number of pps
            avcc_payload += u16.pack(len(pps))
            avcc_payload += pps
            sample_entry_payload += box(b'avcC', avcc_payload)  # AVC Decoder Configuration Record
            sample_entry_box = box(b'avc1', sample_entry_payload)  # AVC Simple Entry
    stsd_payload += sample_entry_box

    stbl_payload = full_box(b'stsd', 0, 0, stsd_payload)  # Sample Description Box

    stts_payload = u32.pack(0)  # entry count
    stbl_payload += full_box(b'stts', 0, 0, stts_payload)  # Decoding Time to Sample Box

    stsc_payload = u32.pack(0)  # entry count
    stbl_payload += full_box(b'stsc', 0, 0, stsc_payload)  # Sample To Chunk Box

    stco_payload = u32.pack(0)  # entry count
    stbl_payload += full_box(b'stco', 0, 0, stco_payload)  # Chunk Offset Box

    minf_payload += box(b'stbl', stbl_payload)  # Sample Table Box

    mdia_payload += box(b'minf', minf_payload)  # Media Information Box

    trak_payload += box(b'mdia', mdia_payload)  # Media Box

    moov_payload += box(b'trak', trak_payload)  # Track Box

    mehd_payload = u64.pack(duration)
    mvex_payload = full_box(b'mehd', 1, 0, mehd_payload)  # Movie Extends Header Box

    trex_payload = u32.pack(track_id)  # track id
    trex_payload += u32.pack(1)  # default sample description index
    trex_payload += u32.pack(0)  # default sample duration
    trex_payload += u32.pack(0)  # default sample size
    trex_payload += u32.pack(0)  # default sample flags
    mvex_payload += full_box(b'trex', 0, 0, trex_payload)  # Track Extends Box

    moov_payload += box(b'mvex', mvex_payload)  # Movie Extends Box
    stream.write(box(b'moov', moov_payload))  # Movie Box
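The `mdhd` box above packs the ISO 639-2 language code into a single 16-bit field: each of the three lowercase letters is stored as `ord(c) - 0x60` in 5 bits, filling the low 15 bits. A standalone version of that expression (hypothetical helper name):

```python
def pack_language(code):
    # Three lowercase ISO 639-2 letters, 5 bits each, in the low 15 bits
    # of a 16-bit field ('a' maps to 1, 'z' to 26).
    return (((ord(code[0]) - 0x60) << 10)
            | ((ord(code[1]) - 0x60) << 5)
            | (ord(code[2]) - 0x60))
```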
def extract_box_data(data, box_sequence):
    data_reader = io.BytesIO(data)
    while True:
        box_size = u32.unpack(data_reader.read(4))[0]
        box_type = data_reader.read(4)
        if box_type == box_sequence[0]:
            box_data = data_reader.read(box_size - 8)
            if len(box_sequence) == 1:
                return box_data
            return extract_box_data(box_data, box_sequence[1:])
        data_reader.seek(box_size - 8, 1)
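`extract_box_data` walks sibling boxes at each level, descending recursively along the requested path. A self-contained round trip, building a toy `moof > traf > tfhd` nesting (the tfhd body carries the track id at offset 4, which is exactly what `IsmFD` reads from the first fragment below):

```python
import io
import struct

u32 = struct.Struct(b'>I')


def box(box_type, payload):
    return u32.pack(8 + len(payload)) + box_type + payload


def extract_box_data(data, box_sequence):
    data_reader = io.BytesIO(data)
    while True:
        box_size = u32.unpack(data_reader.read(4))[0]
        box_type = data_reader.read(4)
        if box_type == box_sequence[0]:
            box_data = data_reader.read(box_size - 8)
            if len(box_sequence) == 1:
                return box_data
            # Descend into this box for the rest of the path.
            return extract_box_data(box_data, box_sequence[1:])
        # Not a match: skip this box's payload and try the next sibling.
        data_reader.seek(box_size - 8, 1)


moof = box(b'moof', box(b'traf', box(b'tfhd', b'\x00' * 4 + u32.pack(7))))
tfhd_data = extract_box_data(moof, [b'moof', b'traf', b'tfhd'])
```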
class IsmFD(FragmentFD):
    """
    Download segments in an ISM manifest
    """

    FD_NAME = 'ism'

    def real_download(self, filename, info_dict):
        segments = info_dict['fragments'][:1] if self.params.get(
            'test', False) else info_dict['fragments']

        ctx = {
            'filename': filename,
            'total_frags': len(segments),
        }

        self._prepare_and_start_frag_download(ctx)

        segments_filenames = []

        fragment_retries = self.params.get('fragment_retries', 0)
        skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True)

        track_written = False
        for i, segment in enumerate(segments):
            segment_url = segment['url']
            segment_name = 'Frag%d' % i
            target_filename = '%s-%s' % (ctx['tmpfilename'], segment_name)
            count = 0
            while count <= fragment_retries:
                try:
                    success = ctx['dl'].download(target_filename, {'url': segment_url})
                    if not success:
                        return False
                    down, target_sanitized = sanitize_open(target_filename, 'rb')
                    down_data = down.read()
                    if not track_written:
                        tfhd_data = extract_box_data(down_data, [b'moof', b'traf', b'tfhd'])
                        info_dict['_download_params']['track_id'] = u32.unpack(tfhd_data[4:8])[0]
                        write_piff_header(ctx['dest_stream'], info_dict['_download_params'])
                        track_written = True
                    ctx['dest_stream'].write(down_data)
                    down.close()
                    segments_filenames.append(target_sanitized)
                    break
                except compat_urllib_error.HTTPError as err:
                    count += 1
                    if count <= fragment_retries:
                        self.report_retry_fragment(err, segment_name, count, fragment_retries)
            if count > fragment_retries:
                if skip_unavailable_fragments:
                    self.report_skip_fragment(segment_name)
                    continue
                self.report_error('giving up after %s fragment retries' % fragment_retries)
                return False

        self._finish_frag_download(ctx)

        for segment_file in segments_filenames:
            os.remove(encodeFilename(segment_file))

        return True
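The per-fragment retry policy in `real_download` can be isolated as a small sketch: retry a download up to `fragment_retries` times on HTTP errors, then either skip the fragment or abort. This is an illustrative distillation (hypothetical helper, with `IOError` standing in for `compat_urllib_error.HTTPError`), not code from the module:

```python
def fetch_with_retries(download, name, fragment_retries,
                       skip_unavailable, report_retry, report_skip):
    # Mirrors IsmFD's loop: retry up to fragment_retries times, then
    # either skip the unavailable fragment or give up entirely.
    count = 0
    while count <= fragment_retries:
        try:
            return download()
        except IOError as err:
            count += 1
            if count <= fragment_retries:
                report_retry(err, name, count)
    if skip_unavailable:
        report_skip(name)
        return None
    raise IOError('giving up after %d fragment retries' % fragment_retries)
```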
@@ -26,6 +26,11 @@ MSO_INFO = {
         'username_field': 'UserName',
         'password_field': 'UserPassword',
     },
+    'Comcast_SSO': {
+        'name': 'Comcast XFINITY',
+        'username_field': 'user',
+        'password_field': 'passwd',
+    },
     'thr030': {
         'name': '3 Rivers Communications'
     },
@@ -1364,14 +1369,53 @@ class AdobePassIE(InfoExtractor):
             'domain_name': 'adobe.com',
             'redirect_url': url,
         })
-        provider_login_page_res = post_form(
-            provider_redirect_page_res, 'Downloading Provider Login Page')
-        mvpd_confirm_page_res = post_form(provider_login_page_res, 'Logging in', {
-            mso_info.get('username_field', 'username'): username,
-            mso_info.get('password_field', 'password'): password,
-        })
-        if mso_id != 'Rogers':
-            post_form(mvpd_confirm_page_res, 'Confirming Login')
+
+        if mso_id == 'Comcast_SSO':
+            # Comcast page flow varies by video site and whether you
+            # are on Comcast's network.
+            provider_redirect_page, urlh = provider_redirect_page_res
+            # Check for Comcast auto login
+            if 'automatically signing you in' in provider_redirect_page:
+                oauth_redirect_url = self._html_search_regex(
+                    r'window\.location\s*=\s*[\'"]([^\'"]+)',
+                    provider_redirect_page, 'oauth redirect')
+                # Just need to process the request. No useful data comes back
+                self._download_webpage(
+                    oauth_redirect_url, video_id, 'Confirming auto login')
+            else:
+                if '<form name="signin"' in provider_redirect_page:
+                    # already have the form, just fill it
+                    provider_login_page_res = provider_redirect_page_res
+                elif 'http-equiv="refresh"' in provider_redirect_page:
+                    # redirects to the login page
+                    oauth_redirect_url = self._html_search_regex(
+                        r'content="0;\s*url=([^\'"]+)',
+                        provider_redirect_page, 'meta refresh redirect')
+                    provider_login_page_res = self._download_webpage_handle(
+                        oauth_redirect_url,
+                        video_id, 'Downloading Provider Login Page')
+                else:
+                    provider_login_page_res = post_form(
+                        provider_redirect_page_res, 'Downloading Provider Login Page')
+
+                mvpd_confirm_page_res = post_form(provider_login_page_res, 'Logging in', {
+                    mso_info.get('username_field', 'username'): username,
+                    mso_info.get('password_field', 'password'): password,
+                })
+                mvpd_confirm_page, urlh = mvpd_confirm_page_res
+                if '<button class="submit" value="Resume">Resume</button>' in mvpd_confirm_page:
+                    post_form(mvpd_confirm_page_res, 'Confirming Login')
+
+        else:
+            # Normal, non-Comcast flow
+            provider_login_page_res = post_form(
+                provider_redirect_page_res, 'Downloading Provider Login Page')
+            mvpd_confirm_page_res = post_form(provider_login_page_res, 'Logging in', {
+                mso_info.get('username_field', 'username'): username,
+                mso_info.get('password_field', 'password'): password,
+            })
+            if mso_id != 'Rogers':
+                post_form(mvpd_confirm_page_res, 'Confirming Login')

         session = self._download_webpage(
             self._SERVICE_PROVIDER_TEMPLATE % 'session', video_id,
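One branch of the new Comcast flow detects a meta-refresh redirect and pulls the login URL out of the `content="0; url=..."` attribute with the regex shown in the patch. A standalone demonstration of that extraction (`meta_refresh_url` is a hypothetical wrapper; the extractor itself calls `self._html_search_regex` with the same pattern):

```python
import re


def meta_refresh_url(page):
    # Extract the target of <meta http-equiv="refresh" content="0; url=...">
    m = re.search(r'content="0;\s*url=([^\'"]+)', page)
    return m.group(1) if m else None
```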
@@ -96,6 +96,27 @@ class AdultSwimIE(TurnerBaseIE):
             'skip_download': True,
         },
         'expected_warnings': ['Unable to download f4m manifest'],
+    }, {
+        'url': 'http://www.adultswim.com/videos/toonami/friday-october-14th-2016/',
+        'info_dict': {
+            'id': 'eYiLsKVgQ6qTC6agD67Sig',
+            'title': 'Toonami - Friday, October 14th, 2016',
+            'description': 'md5:99892c96ffc85e159a428de85c30acde',
+        },
+        'playlist': [{
+            'md5': '',
+            'info_dict': {
+                'id': 'eYiLsKVgQ6qTC6agD67Sig',
+                'ext': 'mp4',
+                'title': 'Toonami - Friday, October 14th, 2016',
+                'description': 'md5:99892c96ffc85e159a428de85c30acde',
+            },
+        }],
+        'params': {
+            # m3u8 download
+            'skip_download': True,
+        },
+        'expected_warnings': ['Unable to download f4m manifest'],
     }]

     @staticmethod
@@ -163,6 +184,8 @@ class AdultSwimIE(TurnerBaseIE):
             segment_ids = [clip['videoPlaybackID'] for clip in video_info['clips']]
         elif video_info.get('videoPlaybackID'):
             segment_ids = [video_info['videoPlaybackID']]
+        elif video_info.get('id'):
+            segment_ids = [video_info['id']]
         else:
             if video_info.get('auth') is True:
                 raise ExtractorError(
@@ -174,11 +174,17 @@ class ARDMediathekIE(InfoExtractor):

         webpage = self._download_webpage(url, video_id)

-        if '>Der gewünschte Beitrag ist nicht mehr verfügbar.<' in webpage:
-            raise ExtractorError('Video %s is no longer available' % video_id, expected=True)
+        ERRORS = (
+            ('>Leider liegt eine Störung vor.', 'Video %s is unavailable'),
+            ('>Der gewünschte Beitrag ist nicht mehr verfügbar.<',
+             'Video %s is no longer available'),
+            ('Diese Sendung ist für Jugendliche unter 12 Jahren nicht geeignet. Der Clip ist deshalb nur von 20 bis 6 Uhr verfügbar.',
+             'This program is only suitable for those aged 12 and older. Video %s is therefore only available between 8 pm and 6 am.'),
+        )

-        if 'Diese Sendung ist für Jugendliche unter 12 Jahren nicht geeignet. Der Clip ist deshalb nur von 20 bis 6 Uhr verfügbar.' in webpage:
-            raise ExtractorError('This program is only suitable for those aged 12 and older. Video %s is therefore only available between 20 pm and 6 am.' % video_id, expected=True)
+        for pattern, message in ERRORS:
+            if pattern in webpage:
+                raise ExtractorError(message % video_id, expected=True)

         if re.search(r'[\?&]rss($|[=&])', url):
             doc = compat_etree_fromstring(webpage.encode('utf-8'))
youtube_dl/extractor/beeg.py
@@ -46,19 +46,19 @@ class BeegIE(InfoExtractor):
             self._proto_relative_url(cpl_url), video_id,
             'Downloading cpl JS', fatal=False)
         if cpl:
-            beeg_version = self._search_regex(
-                r'beeg_version\s*=\s*(\d+)', cpl,
-                'beeg version', default=None) or self._search_regex(
+            beeg_version = int_or_none(self._search_regex(
+                r'beeg_version\s*=\s*([^\b]+)', cpl,
+                'beeg version', default=None)) or self._search_regex(
                 r'/(\d+)\.js', cpl_url, 'beeg version', default=None)
             beeg_salt = self._search_regex(
-                r'beeg_salt\s*=\s*(["\'])(?P<beeg_salt>.+?)\1', cpl, 'beeg beeg_salt',
+                r'beeg_salt\s*=\s*(["\'])(?P<beeg_salt>.+?)\1', cpl, 'beeg salt',
                 default=None, group='beeg_salt')
 
-        beeg_version = beeg_version or '1750'
-        beeg_salt = beeg_salt or 'MIDtGaw96f0N1kMMAM1DE46EC9pmFr'
+        beeg_version = beeg_version or '2000'
+        beeg_salt = beeg_salt or 'pmweAkq8lAYKdfWcFCUj0yoVgoPlinamH5UE1CB3H'
 
         video = self._download_json(
-            'http://api.beeg.com/api/v6/%s/video/%s' % (beeg_version, video_id),
+            'https://api.beeg.com/api/v6/%s/video/%s' % (beeg_version, video_id),
            video_id)
 
         def split(o, e):
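The new `beeg_version` expression above chains two probes with `or`: a page regex wrapped in `int_or_none()`, then a fallback to the version number embedded in the script URL. A sketch of that fallback chain with simplified helpers (names and sample data are illustrative):

```python
import re


def int_or_none(v):
    # Simplified stand-in for youtube-dl's int_or_none(): None-safe int().
    try:
        return int(v)
    except (TypeError, ValueError):
        return None


def search(pattern, text):
    # Simplified stand-in for _search_regex(..., default=None).
    m = re.search(pattern, text)
    return m.group(1) if m else None


cpl = 'var beeg_version = 2185;'
cpl_url = '//static.beeg.com/cpl/2185.js'
# The greedy char class in the first pattern can capture trailing junk
# (here the ';'), so int_or_none() returns None and the URL probe kicks in.
beeg_version = (int_or_none(search(r'beeg_version\s*=\s*([^\b]+)', cpl))
                or search(r'/(\d+)\.js', cpl_url))
```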
youtube_dl/extractor/common.py
@@ -30,6 +30,7 @@ from ..downloader.f4m import remove_encrypted_media
 from ..utils import (
     NO_DEFAULT,
     age_restricted,
+    base_url,
     bug_reports_message,
     clean_html,
     compiled_regex_type,
@@ -235,7 +236,7 @@ class InfoExtractor(object):
     chapter_id: Id of the chapter the video belongs to, as a unicode string.
 
     The following fields should only be used when the video is an episode of some
-    series or programme:
+    series, programme or podcast:
 
     series: Title of the series or programme the video episode belongs to.
     season: Title of the season the video episode belongs to.
@@ -1100,6 +1101,13 @@ class InfoExtractor(object):
             manifest, ['{http://ns.adobe.com/f4m/1.0}bootstrapInfo', '{http://ns.adobe.com/f4m/2.0}bootstrapInfo'],
             'bootstrap info', default=None)
 
+        vcodec = None
+        mime_type = xpath_text(
+            manifest, ['{http://ns.adobe.com/f4m/1.0}mimeType', '{http://ns.adobe.com/f4m/2.0}mimeType'],
+            'base URL', default=None)
+        if mime_type and mime_type.startswith('audio/'):
+            vcodec = 'none'
+
         for i, media_el in enumerate(media_nodes):
             tbr = int_or_none(media_el.attrib.get('bitrate'))
             width = int_or_none(media_el.attrib.get('width'))
@@ -1140,6 +1148,7 @@ class InfoExtractor(object):
                         'width': f.get('width') or width,
                         'height': f.get('height') or height,
                         'format_id': f.get('format_id') if not tbr else format_id,
+                        'vcodec': vcodec,
                     })
                 formats.extend(f4m_formats)
                 continue
@@ -1156,6 +1165,7 @@ class InfoExtractor(object):
                 'tbr': tbr,
                 'width': width,
                 'height': height,
+                'vcodec': vcodec,
                 'preference': preference,
             })
         return formats
@@ -1530,7 +1540,7 @@ class InfoExtractor(object):
         if res is False:
             return []
         mpd, urlh = res
-        mpd_base_url = re.match(r'https?://.+/', urlh.geturl()).group()
+        mpd_base_url = base_url(urlh.geturl())
 
         return self._parse_mpd_formats(
             compat_etree_fromstring(mpd.encode('utf-8')), mpd_id, mpd_base_url,
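The `@@ -1530` hunk above swaps an inline greedy regex for the new `base_url()` utility. A minimal re-implementation, assuming the helper's job is "everything up to and including the last slash of the path"; the inline `https?://.+/` version could greedily run past a query string that itself contains slashes:

```python
import re


def base_url(url):
    # Minimal sketch of the base_url() helper: match up to the last '/'
    # of the path, stopping before any query string or fragment.
    return re.match(r'https?://[^?#]+/', url).group()


mpd_base = base_url('https://example.com/a/b/manifest.mpd?token=x/y')
```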
@@ -1771,6 +1781,105 @@ class InfoExtractor(object):
                 self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type)
         return formats
 
+    def _extract_ism_formats(self, ism_url, video_id, ism_id=None, note=None, errnote=None, fatal=True):
+        res = self._download_webpage_handle(
+            ism_url, video_id,
+            note=note or 'Downloading ISM manifest',
+            errnote=errnote or 'Failed to download ISM manifest',
+            fatal=fatal)
+        if res is False:
+            return []
+        ism, urlh = res
+
+        return self._parse_ism_formats(
+            compat_etree_fromstring(ism.encode('utf-8')), urlh.geturl(), ism_id)
+
+    def _parse_ism_formats(self, ism_doc, ism_url, ism_id=None):
+        if ism_doc.get('IsLive') == 'TRUE' or ism_doc.find('Protection') is not None:
+            return []
+
+        duration = int(ism_doc.attrib['Duration'])
+        timescale = int_or_none(ism_doc.get('TimeScale')) or 10000000
+
+        formats = []
+        for stream in ism_doc.findall('StreamIndex'):
+            stream_type = stream.get('Type')
+            if stream_type not in ('video', 'audio'):
+                continue
+            url_pattern = stream.attrib['Url']
+            stream_timescale = int_or_none(stream.get('TimeScale')) or timescale
+            stream_name = stream.get('Name')
+            for track in stream.findall('QualityLevel'):
+                fourcc = track.get('FourCC')
+                # TODO: add support for WVC1 and WMAP
+                if fourcc not in ('H264', 'AVC1', 'AACL'):
+                    self.report_warning('%s is not a supported codec' % fourcc)
+                    continue
+                tbr = int(track.attrib['Bitrate']) // 1000
+                width = int_or_none(track.get('MaxWidth'))
+                height = int_or_none(track.get('MaxHeight'))
+                sampling_rate = int_or_none(track.get('SamplingRate'))
+
+                track_url_pattern = re.sub(r'{[Bb]itrate}', track.attrib['Bitrate'], url_pattern)
+                track_url_pattern = compat_urlparse.urljoin(ism_url, track_url_pattern)
+
+                fragments = []
+                fragment_ctx = {
+                    'time': 0,
+                }
+                stream_fragments = stream.findall('c')
+                for stream_fragment_index, stream_fragment in enumerate(stream_fragments):
+                    fragment_ctx['time'] = int_or_none(stream_fragment.get('t')) or fragment_ctx['time']
+                    fragment_repeat = int_or_none(stream_fragment.get('r')) or 1
+                    fragment_ctx['duration'] = int_or_none(stream_fragment.get('d'))
+                    if not fragment_ctx['duration']:
+                        try:
+                            next_fragment_time = int(stream_fragment[stream_fragment_index + 1].attrib['t'])
+                        except IndexError:
+                            next_fragment_time = duration
+                        fragment_ctx['duration'] = (next_fragment_time - fragment_ctx['time']) / fragment_repeat
+                    for _ in range(fragment_repeat):
+                        fragments.append({
+                            'url': re.sub(r'{start[ _]time}', compat_str(fragment_ctx['time']), track_url_pattern),
+                            'duration': fragment_ctx['duration'] / stream_timescale,
+                        })
+                        fragment_ctx['time'] += fragment_ctx['duration']
+
+                format_id = []
+                if ism_id:
+                    format_id.append(ism_id)
+                if stream_name:
+                    format_id.append(stream_name)
+                format_id.append(compat_str(tbr))
+
+                formats.append({
+                    'format_id': '-'.join(format_id),
+                    'url': ism_url,
+                    'manifest_url': ism_url,
+                    'ext': 'ismv' if stream_type == 'video' else 'isma',
+                    'width': width,
+                    'height': height,
+                    'tbr': tbr,
+                    'asr': sampling_rate,
+                    'vcodec': 'none' if stream_type == 'audio' else fourcc,
+                    'acodec': 'none' if stream_type == 'video' else fourcc,
+                    'protocol': 'ism',
+                    'fragments': fragments,
+                    '_download_params': {
+                        'duration': duration,
+                        'timescale': stream_timescale,
+                        'width': width or 0,
+                        'height': height or 0,
+                        'fourcc': fourcc,
+                        'codec_private_data': track.get('CodecPrivateData'),
+                        'sampling_rate': sampling_rate,
+                        'channels': int_or_none(track.get('Channels', 2)),
+                        'bits_per_sample': int_or_none(track.get('BitsPerSample', 16)),
+                        'nal_unit_length_field': int_or_none(track.get('NALUnitLengthField', 4)),
+                    },
+                })
+        return formats
+
     def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8'):
         def absolute_url(video_url):
             return compat_urlparse.urljoin(base_url, video_url)
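In `_parse_ism_formats` above, each `c` element of the manifest contributes `t` (start time), `d` (duration) and `r` (repeat count), and a missing duration is inferred from the next entry's start time. The same expansion over plain dicts, as a self-contained sketch of the fragment-timing logic:

```python
def expand_fragments(entries, total_duration):
    # entries: list of {'t': start, 'd': duration, 'r': repeat} dicts,
    # mirroring the <c> elements of a Smooth Streaming manifest.
    fragments = []
    time = 0
    for i, entry in enumerate(entries):
        time = entry.get('t') or time
        repeat = entry.get('r') or 1
        duration = entry.get('d')
        if not duration:
            # Infer duration from the next entry's start time (or the
            # total manifest duration for the last entry), split evenly
            # across the repeated fragments.
            try:
                next_time = entries[i + 1]['t']
            except IndexError:
                next_time = total_duration
            duration = (next_time - time) / repeat
        for _ in range(repeat):
            fragments.append({'start': time, 'duration': duration})
            time += duration
    return fragments


frags = expand_fragments([{'t': 0, 'd': 20, 'r': 3}, {'r': 2}], 100)
```

Here the first entry expands into three 20-unit fragments, and the second entry's duration is inferred as (100 - 60) / 2.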
@@ -1875,11 +1984,11 @@ class InfoExtractor(object):
             formats.extend(self._extract_f4m_formats(
                 http_base_url + '/manifest.f4m',
                 video_id, f4m_id='hds', fatal=False))
-        if 'dash' not in skip_protocols:
-            formats.extend(self._extract_mpd_formats(
-                http_base_url + '/manifest.mpd',
-                video_id, mpd_id='dash', fatal=False))
         if re.search(r'(?:/smil:|\.smil)', url_base):
+            if 'dash' not in skip_protocols:
+                formats.extend(self._extract_mpd_formats(
+                    http_base_url + '/manifest.mpd',
+                    video_id, mpd_id='dash', fatal=False))
             if 'smil' not in skip_protocols:
                 rtmp_formats = self._extract_smil_formats(
                     http_base_url + '/jwplayer.smil',
youtube_dl/extractor/dotsub.py
@@ -9,7 +9,7 @@ from ..utils import (
 
 class DotsubIE(InfoExtractor):
     _VALID_URL = r'https?://(?:www\.)?dotsub\.com/view/(?P<id>[^/]+)'
-    _TEST = {
+    _TESTS = [{
         'url': 'https://dotsub.com/view/9c63db2a-fa95-4838-8e6e-13deafe47f09',
         'md5': '21c7ff600f545358134fea762a6d42b6',
         'info_dict': {
@@ -24,7 +24,24 @@ class DotsubIE(InfoExtractor):
             'upload_date': '20131130',
             'view_count': int,
         }
-    }
+    }, {
+        'url': 'https://dotsub.com/view/747bcf58-bd59-45b7-8c8c-ac312d084ee6',
+        'md5': '2bb4a83896434d5c26be868c609429a3',
+        'info_dict': {
+            'id': '168006778',
+            'ext': 'mp4',
+            'title': 'Apartments and flats in Raipur the white symphony',
+            'description': 'md5:784d0639e6b7d1bc29530878508e38fe',
+            'thumbnail': 're:^https?://dotsub.com/media/747bcf58-bd59-45b7-8c8c-ac312d084ee6/p',
+            'duration': 290,
+            'timestamp': 1476767794.2809999,
+            'upload_date': '20160525',
+            'uploader': 'parthivi001',
+            'uploader_id': 'user52596202',
+            'view_count': int,
+        },
+        'add_ie': ['Vimeo'],
+    }]
 
     def _real_extract(self, url):
         video_id = self._match_id(url)
@@ -37,12 +54,23 @@ class DotsubIE(InfoExtractor):
             webpage = self._download_webpage(url, video_id)
             video_url = self._search_regex(
                 [r'<source[^>]+src="([^"]+)"', r'"file"\s*:\s*\'([^\']+)'],
-                webpage, 'video url')
+                webpage, 'video url', default=None)
+            info_dict = {
+                'id': video_id,
+                'url': video_url,
+                'ext': 'flv',
+            }
 
-        return {
-            'id': video_id,
-            'url': video_url,
-            'ext': 'flv',
+            if not video_url:
+                setup_data = self._parse_json(self._html_search_regex(
+                    r'(?s)data-setup=([\'"])(?P<content>(?!\1).+?)\1',
+                    webpage, 'setup data', group='content'), video_id)
+                info_dict = {
+                    '_type': 'url_transparent',
+                    'url': setup_data['src'],
+                }
+
+        info_dict.update({
             'title': info['title'],
             'description': info.get('description'),
             'thumbnail': info.get('screenshotURI'),
@@ -50,4 +78,6 @@ class DotsubIE(InfoExtractor):
             'uploader': info.get('user'),
             'timestamp': float_or_none(info.get('dateCreated'), 1000),
            'view_count': int_or_none(info.get('numberOfViews')),
-        }
+        })
+
+        return info_dict
youtube_dl/extractor/extractors.py
@@ -408,6 +408,10 @@ from .ivi import (
 from .ivideon import IvideonIE
 from .iwara import IwaraIE
 from .izlesene import IzleseneIE
+from .jamendo import (
+    JamendoIE,
+    JamendoAlbumIE,
+)
 from .jeuxvideo import JeuxVideoIE
 from .jove import JoveIE
 from .jwplatform import JWPlatformIE
@@ -592,6 +596,7 @@ from .nhl import (
 from .nick import (
     NickIE,
     NickDeIE,
+    NickNightIE,
 )
 from .niconico import NiconicoIE, NiconicoPlaylistIE
 from .ninecninemedia import (
@@ -601,6 +606,7 @@ from .ninecninemedia import (
 from .ninegag import NineGagIE
 from .ninenow import NineNowIE
 from .nintendo import NintendoIE
+from .nobelprize import NobelPrizeIE
 from .noco import NocoIE
 from .normalboots import NormalbootsIE
 from .nosvideo import NosVideoIE
@@ -667,6 +673,7 @@ from .orf import (
     ORFFM4IE,
     ORFIPTVIE,
 )
+from .pandatv import PandaTVIE
 from .pandoratv import PandoraTVIE
 from .parliamentliveuk import ParliamentLiveUKIE
 from .patreon import PatreonIE
@@ -740,6 +747,10 @@ from .rbmaradio import RBMARadioIE
 from .rds import RDSIE
 from .redtube import RedTubeIE
 from .regiotv import RegioTVIE
+from .rentv import (
+    RENTVIE,
+    RENTVArticleIE,
+)
 from .restudy import RestudyIE
 from .reuters import ReutersIE
 from .reverbnation import ReverbNationIE
@@ -796,7 +807,10 @@ from .sendtonews import SendtoNewsIE
 from .servingsys import ServingSysIE
 from .sexu import SexuIE
 from .shahid import ShahidIE
-from .shared import SharedIE
+from .shared import (
+    SharedIE,
+    VivoIE,
+)
 from .sharesix import ShareSixIE
 from .sina import SinaIE
 from .sixplay import SixPlayIE
youtube_dl/extractor/facebook.py
@@ -1,6 +1,5 @@
 from __future__ import unicode_literals
 
-import json
 import re
 import socket
 
@@ -100,7 +99,8 @@ class FacebookIE(InfoExtractor):
             'ext': 'mp4',
             'title': '"What are you doing running in the snow?"',
             'uploader': 'FailArmy',
-        }
+        },
+        'skip': 'Video gone',
     }, {
         'url': 'https://m.facebook.com/story.php?story_fbid=1035862816472149&id=116132035111903',
         'md5': '1deb90b6ac27f7efcf6d747c8a27f5e3',
@@ -110,6 +110,7 @@ class FacebookIE(InfoExtractor):
             'title': 'What the Flock Is Going On In New Zealand Credit: ViralHog',
             'uploader': 'S. Saint',
         },
+        'skip': 'Video gone',
     }, {
         'note': 'swf params escaped',
         'url': 'https://www.facebook.com/barackobama/posts/10153664894881749',
@@ -119,6 +120,18 @@ class FacebookIE(InfoExtractor):
             'ext': 'mp4',
             'title': 'Facebook video #10153664894881749',
         },
+    }, {
+        # have 1080P, but only up to 720p in swf params
+        'url': 'https://www.facebook.com/cnn/videos/10155529876156509/',
+        'md5': '0d9813160b146b3bc8744e006027fcc6',
+        'info_dict': {
+            'id': '10155529876156509',
+            'ext': 'mp4',
+            'title': 'Holocaust survivor becomes US citizen',
+            'timestamp': 1477818095,
+            'upload_date': '20161030',
+            'uploader': 'CNN',
+        },
     }, {
         'url': 'https://www.facebook.com/video.php?v=10204634152394104',
         'only_matching': True,
@@ -227,43 +240,13 @@ class FacebookIE(InfoExtractor):
 
         video_data = None
 
-        BEFORE = '{swf.addParam(param[0], param[1]);});'
-        AFTER = '.forEach(function(variable) {swf.addVariable(variable[0], variable[1]);});'
-        PATTERN = re.escape(BEFORE) + '(?:\n|\\\\n)(.*?)' + re.escape(AFTER)
-
-        for m in re.findall(PATTERN, webpage):
-            swf_params = m.replace('\\\\', '\\').replace('\\"', '"')
-            data = dict(json.loads(swf_params))
-            params_raw = compat_urllib_parse_unquote(data['params'])
-            video_data_candidate = json.loads(params_raw)['video_data']
-            for _, f in video_data_candidate.items():
-                if not f:
-                    continue
-                if isinstance(f, dict):
-                    f = [f]
-                if not isinstance(f, list):
-                    continue
-                if f[0].get('video_id') == video_id:
-                    video_data = video_data_candidate
-                    break
-            if video_data:
-                break
+        server_js_data = self._parse_json(self._search_regex(
+            r'handleServerJS\(({.+})(?:\);|,")', webpage, 'server js data', default='{}'), video_id)
+        for item in server_js_data.get('instances', []):
+            if item[1][0] == 'VideoConfig':
+                video_data = item[2][0]['videoData']
+                break
 
-        def video_data_list2dict(video_data):
-            ret = {}
-            for item in video_data:
-                format_id = item['stream_type']
-                ret.setdefault(format_id, []).append(item)
-            return ret
-
-        if not video_data:
-            server_js_data = self._parse_json(self._search_regex(
-                r'handleServerJS\(({.+})(?:\);|,")', webpage, 'server js data', default='{}'), video_id)
-            for item in server_js_data.get('instances', []):
-                if item[1][0] == 'VideoConfig':
-                    video_data = video_data_list2dict(item[2][0]['videoData'])
-                    break
 
         if not video_data:
             if not fatal_if_no_video:
                 return webpage, False
@@ -276,7 +259,8 @@ class FacebookIE(InfoExtractor):
             raise ExtractorError('Cannot parse data')
 
         formats = []
-        for format_id, f in video_data.items():
+        for f in video_data:
+            format_id = f['stream_type']
             if f and isinstance(f, dict):
                 f = [f]
             if not f or not isinstance(f, list):
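The rewritten facebook.py path above reads video data straight out of the `handleServerJS()` payload instead of scraping swf params. A sketch of that lookup over plain data, with the nested-list shape approximated from the diff (`item[1][0]` is the instance type, `item[2][0]` its config):

```python
def find_video_data(server_js_data):
    # Walk the 'instances' array and return the first VideoConfig's
    # videoData, as the loop in the diff does; None if absent.
    for item in server_js_data.get('instances', []):
        if item[1][0] == 'VideoConfig':
            return item[2][0]['videoData']
    return None


# Hypothetical payload shape for illustration only.
sample = {'instances': [
    ['0', ['OtherModule'], [{}]],
    ['1', ['VideoConfig'], [{'videoData': [{'stream_type': 'hd'}]}]],
]}
video_data = find_video_data(sample)
```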
youtube_dl/extractor/generic.py
@@ -1208,20 +1208,6 @@ class GenericIE(InfoExtractor):
                 'duration': 51690,
             },
         },
-        # JWPlayer with M3U8
-        {
-            'url': 'http://ren.tv/novosti/2015-09-25/sluchaynyy-prohozhiy-poymal-avtougonshchika-v-murmanske-video',
-            'info_dict': {
-                'id': 'playlist',
-                'ext': 'mp4',
-                'title': 'Случайный прохожий поймал автоугонщика в Мурманске. ВИДЕО | РЕН ТВ',
-                'uploader': 'ren.tv',
-            },
-            'params': {
-                # m3u8 downloads
-                'skip_download': True,
-            }
-        },
         # Brightcove embed, with no valid 'renditions' but valid 'IOSRenditions'
         # This video can't be played in browsers if Flash disabled and UA set to iPhone, which is actually a false alarm
         {
youtube_dl/extractor/googlesearch.py
@@ -4,9 +4,6 @@ import itertools
 import re
 
 from .common import SearchInfoExtractor
-from ..compat import (
-    compat_urllib_parse,
-)
 
 
 class GoogleSearchIE(SearchInfoExtractor):
@@ -34,13 +31,16 @@ class GoogleSearchIE(SearchInfoExtractor):
         }
 
         for pagenum in itertools.count():
-            result_url = (
-                'http://www.google.com/search?tbm=vid&q=%s&start=%s&hl=en'
-                % (compat_urllib_parse.quote_plus(query), pagenum * 10))
-
             webpage = self._download_webpage(
-                result_url, 'gvsearch:' + query,
-                note='Downloading result page ' + str(pagenum + 1))
+                'http://www.google.com/search',
+                'gvsearch:' + query,
+                note='Downloading result page %s' % (pagenum + 1),
+                query={
+                    'tbm': 'vid',
+                    'q': query,
+                    'start': pagenum * 10,
+                    'hl': 'en',
+                })
 
             for hit_idx, mobj in enumerate(re.finditer(
                     r'<h3 class="r"><a href="([^"]+)"', webpage)):
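The googlesearch.py change drops manual URL assembly (with `compat_urllib_parse.quote_plus`) in favour of passing a `query=` dict to `_download_webpage`, which serializes and escapes it. The equivalent serialization with the standard library:

```python
from urllib.parse import urlencode

# What the query= dict from the diff serializes to; urlencode applies
# quote_plus-style escaping, so the space in the query becomes '+'.
params = {'tbm': 'vid', 'q': 'big buck bunny', 'start': 0, 'hl': 'en'}
url = 'http://www.google.com/search?' + urlencode(params)
```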
youtube_dl/extractor/hornbunny.py
@@ -1,8 +1,6 @@
 # coding: utf-8
 from __future__ import unicode_literals
 
-import re
-
 from .common import InfoExtractor
 from ..utils import (
     int_or_none,
@@ -14,29 +12,24 @@ class HornBunnyIE(InfoExtractor):
     _VALID_URL = r'http?://(?:www\.)?hornbunny\.com/videos/(?P<title_dash>[a-z-]+)-(?P<id>\d+)\.html'
     _TEST = {
         'url': 'http://hornbunny.com/videos/panty-slut-jerk-off-instruction-5227.html',
-        'md5': '95e40865aedd08eff60272b704852ad7',
+        'md5': 'e20fd862d1894b67564c96f180f43924',
         'info_dict': {
             'id': '5227',
-            'ext': 'flv',
+            'ext': 'mp4',
             'title': 'panty slut jerk off instruction',
             'duration': 550,
             'age_limit': 18,
+            'view_count': int,
+            'thumbnail': 're:^https?://.*\.jpg$',
         }
     }
 
     def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
+        video_id = self._match_id(url)
 
-        webpage = self._download_webpage(
-            url, video_id, note='Downloading initial webpage')
-        title = self._html_search_regex(
-            r'class="title">(.*?)</h2>', webpage, 'title')
-        redirect_url = self._html_search_regex(
-            r'pg&settings=(.*?)\|0"\);', webpage, 'title')
-        webpage2 = self._download_webpage(redirect_url, video_id)
-        video_url = self._html_search_regex(
-            r'flvMask:(.*?);', webpage2, 'video_url')
+        webpage = self._download_webpage(url, video_id)
+        title = self._og_search_title(webpage)
+        info_dict = self._parse_html5_media_entries(url, webpage, video_id)[0]
 
         duration = parse_duration(self._search_regex(
             r'<strong>Runtime:</strong>\s*([0-9:]+)</div>',
@@ -45,12 +38,12 @@ class HornBunnyIE(InfoExtractor):
             r'<strong>Views:</strong>\s*(\d+)</div>',
             webpage, 'view count', fatal=False))
 
-        return {
+        info_dict.update({
             'id': video_id,
-            'url': video_url,
             'title': title,
-            'ext': 'flv',
             'duration': duration,
             'view_count': view_count,
             'age_limit': 18,
-        }
+        })
+
+        return info_dict
youtube_dl/extractor/imgur.py
@@ -13,7 +13,7 @@ from ..utils import (
 
 
 class ImgurIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:i\.)?imgur\.com/(?:(?:gallery|topic/[^/]+)/)?(?P<id>[a-zA-Z0-9]{6,})(?:[/?#&]+|\.[a-z]+)?$'
+    _VALID_URL = r'https?://(?:i\.)?imgur\.com/(?:(?:gallery|(?:topic|r)/[^/]+)/)?(?P<id>[a-zA-Z0-9]{6,})(?:[/?#&]+|\.[a-z]+)?$'
 
     _TESTS = [{
         'url': 'https://i.imgur.com/A61SaA1.gifv',
@@ -43,6 +43,9 @@ class ImgurIE(InfoExtractor):
     }, {
         'url': 'http://imgur.com/topic/Funny/N8rOudd',
         'only_matching': True,
+    }, {
+        'url': 'http://imgur.com/r/aww/VQcQPhM',
+        'only_matching': True,
     }]
 
     def _real_extract(self, url):
youtube_dl/extractor/jamendo.py (new file, 107 lines)
@@ -0,0 +1,107 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from ..compat import compat_urlparse
+from .common import InfoExtractor
+
+
+class JamendoIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?jamendo\.com/track/(?P<id>[0-9]+)/(?P<display_id>[^/?#&]+)'
+    _TEST = {
+        'url': 'https://www.jamendo.com/track/196219/stories-from-emona-i',
+        'md5': '6e9e82ed6db98678f171c25a8ed09ffd',
+        'info_dict': {
+            'id': '196219',
+            'display_id': 'stories-from-emona-i',
+            'ext': 'flac',
+            'title': 'Stories from Emona I',
+            'thumbnail': 're:^https?://.*\.jpg'
+        }
+    }
+
+    def _real_extract(self, url):
+        mobj = self._VALID_URL_RE.match(url)
+        track_id = mobj.group('id')
+        display_id = mobj.group('display_id')
+
+        webpage = self._download_webpage(url, display_id)
+
+        title = self._html_search_meta('name', webpage, 'title')
+
+        formats = [{
+            'url': 'https://%s.jamendo.com/?trackid=%s&format=%s&from=app-97dab294'
+                   % (sub_domain, track_id, format_id),
+            'format_id': format_id,
+            'ext': ext,
+            'quality': quality,
+        } for quality, (format_id, sub_domain, ext) in enumerate((
+            ('mp31', 'mp3l', 'mp3'),
+            ('mp32', 'mp3d', 'mp3'),
+            ('ogg1', 'ogg', 'ogg'),
+            ('flac', 'flac', 'flac'),
+        ))]
+        self._sort_formats(formats)
+
+        thumbnail = self._html_search_meta(
+            'image', webpage, 'thumbnail', fatal=False)
+
+        return {
+            'id': track_id,
+            'display_id': display_id,
+            'thumbnail': thumbnail,
+            'title': title,
+            'formats': formats
+        }
+
+
+class JamendoAlbumIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?jamendo\.com/album/(?P<id>[0-9]+)/(?P<display_id>[\w-]+)'
+    _TEST = {
+        'url': 'https://www.jamendo.com/album/121486/duck-on-cover',
+        'info_dict': {
+            'id': '121486',
+            'title': 'Duck On Cover'
+        },
+        'playlist': [{
+            'md5': 'e1a2fcb42bda30dfac990212924149a8',
+            'info_dict': {
+                'id': '1032333',
+                'ext': 'flac',
+                'title': 'Warmachine'
+            }
+        }, {
+            'md5': '1f358d7b2f98edfe90fd55dac0799d50',
+            'info_dict': {
+                'id': '1032330',
+                'ext': 'flac',
+                'title': 'Without Your Ghost'
+            }
+        }],
+        'params': {
+            'playlistend': 2
+        }
+    }
+
+    def _real_extract(self, url):
+        mobj = self._VALID_URL_RE.match(url)
+        album_id = mobj.group('id')
+
+        webpage = self._download_webpage(url, mobj.group('display_id'))
+
+        title = self._html_search_meta('name', webpage, 'title')
+
+        entries = [
+            self.url_result(
+                compat_urlparse.urljoin(url, m.group('path')),
+                ie=JamendoIE.ie_key(),
+                video_id=self._search_regex(
+                    r'/track/(\d+)', m.group('path'),
+                    'track id', default=None))
+            for m in re.finditer(
+                r'<a[^>]+href=(["\'])(?P<path>(?:(?!\1).)+)\1[^>]+class=["\'][^>]*js-trackrow-albumpage-link',
+                webpage)
+        ]
+
+        return self.playlist_result(entries, album_id, title)
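JamendoIE above derives each format's `quality` from its position in the tuple list via `enumerate`, so later entries (lossless last) rank higher for the format sorter. The same pattern on plain data:

```python
# Position in the tuple list doubles as the quality rank, as in the
# JamendoIE formats comprehension above (sub_domain is unused here).
FORMAT_INFO = (
    ('mp31', 'mp3l', 'mp3'),
    ('mp32', 'mp3d', 'mp3'),
    ('ogg1', 'ogg', 'ogg'),
    ('flac', 'flac', 'flac'),
)
formats = [{
    'format_id': format_id,
    'ext': ext,
    'quality': quality,
} for quality, (format_id, sub_domain, ext) in enumerate(FORMAT_INFO)]
best = max(formats, key=lambda f: f['quality'])
```

Keeping the ordering in the data means adding a format never requires touching the ranking logic.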
youtube_dl/extractor/litv.py
@@ -2,7 +2,6 @@
 from __future__ import unicode_literals
 
 import json
-import re
 
 from .common import InfoExtractor
 from ..utils import (
@@ -52,8 +51,8 @@ class LiTVIE(InfoExtractor):
         'skip': 'Georestricted to Taiwan',
     }]
 
-    def _extract_playlist(self, season_list, video_id, vod_data, view_data, prompt=True):
-        episode_title = view_data['title']
+    def _extract_playlist(self, season_list, video_id, program_info, prompt=True):
+        episode_title = program_info['title']
         content_id = season_list['contentId']
 
         if prompt:
@@ -61,7 +60,7 @@ class LiTVIE(InfoExtractor):
 
         all_episodes = [
             self.url_result(smuggle_url(
-                self._URL_TEMPLATE % (view_data['contentType'], episode['contentId']),
+                self._URL_TEMPLATE % (program_info['contentType'], episode['contentId']),
                 {'force_noplaylist': True}))  # To prevent infinite recursion
             for episode in season_list['episode']]
 
@@ -80,19 +79,15 @@ class LiTVIE(InfoExtractor):
 
         webpage = self._download_webpage(url, video_id)
 
-        view_data = dict(map(lambda t: (t[0], t[2]), re.findall(
-            r'viewData\.([a-zA-Z]+)\s*=\s*(["\'])([^"\']+)\2',
-            webpage)))
-
-        vod_data = self._parse_json(self._search_regex(
-            'var\s+vod\s*=\s*([^;]+)', webpage, 'VOD data', default='{}'),
+        program_info = self._parse_json(self._search_regex(
+            'var\s+programInfo\s*=\s*([^;]+)', webpage, 'VOD data', default='{}'),
             video_id)
 
-        season_list = list(vod_data.get('seasonList', {}).values())
+        season_list = list(program_info.get('seasonList', {}).values())
         if season_list:
             if not noplaylist:
                 return self._extract_playlist(
-                    season_list[0], video_id, vod_data, view_data,
+                    season_list[0], video_id, program_info,
                     prompt=noplaylist_prompt)
 
             if noplaylist_prompt:
@@ -102,8 +97,8 @@ class LiTVIE(InfoExtractor):
         # endpoint gives the same result as the data embedded in the webpage.
         # If georestricted, there are no embedded data, so an extra request is
         # necessary to get the error code
-        if 'assetId' not in view_data:
-            view_data = self._download_json(
+        if 'assetId' not in program_info:
+            program_info = self._download_json(
                 'https://www.litv.tv/vod/ajax/getProgramInfo', video_id,
                 query={'contentId': video_id},
                 headers={'Accept': 'application/json'})
@@ -112,9 +107,9 @@ class LiTVIE(InfoExtractor):
             webpage, 'video data', default='{}'), video_id)
         if not video_data:
             payload = {
-                'assetId': view_data['assetId'],
-                'watchDevices': view_data['watchDevices'],
-                'contentType': view_data['contentType'],
+                'assetId': program_info['assetId'],
+                'watchDevices': program_info['watchDevices'],
+                'contentType': program_info['contentType'],
             }
             video_data = self._download_json(
                 'https://www.litv.tv/vod/getMainUrl', video_id,
@@ -136,11 +131,11 @@ class LiTVIE(InfoExtractor):
             # LiTV HLS segments doesn't like compressions
             a_format.setdefault('http_headers', {})['Youtubedl-no-compression'] = True
 
-        title = view_data['title'] + view_data.get('secondaryMark', '')
-        description = view_data.get('description')
-        thumbnail = view_data.get('imageFile')
-        categories = [item['name'] for item in vod_data.get('category', [])]
-        episode = int_or_none(view_data.get('episode'))
+        title = program_info['title'] + program_info.get('secondaryMark', '')
+        description = program_info.get('description')
+        thumbnail = program_info.get('imageFile')
+        categories = [item['name'] for item in program_info.get('category', [])]
+        episode = int_or_none(program_info.get('episode'))
 
         return {
             'id': video_id,
youtube_dl/extractor/microsoftvirtualacademy.py
@@ -71,12 +71,15 @@ class MicrosoftVirtualAcademyIE(MicrosoftVirtualAcademyBaseIE):
         formats = []
 
         for sources in settings.findall(compat_xpath('.//MediaSources')):
-            if sources.get('videoType') == 'smoothstreaming':
-                continue
+            sources_type = sources.get('videoType')
             for source in sources.findall(compat_xpath('./MediaSource')):
                 video_url = source.text
                 if not video_url or not video_url.startswith('http'):
                     continue
+                if sources_type == 'smoothstreaming':
+                    formats.extend(self._extract_ism_formats(
+                        video_url, video_id, 'mss', fatal=False))
+                    continue
                 video_mode = source.get('videoMode')
                 height = int_or_none(self._search_regex(
                     r'^(\d+)[pP]$', video_mode or '', 'height', default=None))
|
||||
|
||||
|
||||
class MovieClipsIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www.)?movieclips\.com/videos/.+-(?P<id>\d+)(?:\?|$)'
|
||||
_VALID_URL = r'https?://(?:www\.)?movieclips\.com/videos/.+-(?P<id>\d+)(?:\?|$)'
|
||||
_TEST = {
|
||||
'url': 'http://www.movieclips.com/videos/warcraft-trailer-1-561180739597',
|
||||
'md5': '42b5a0352d4933a7bd54f2104f481244',
|
||||
|
||||
youtube_dl/extractor/msn.py
@@ -69,10 +69,9 @@ class MSNIE(InfoExtractor):
             if not format_url:
                 continue
             ext = determine_ext(format_url)
-            # .ism is not yet supported (see
-            # https://github.com/rg3/youtube-dl/issues/8118)
             if ext == 'ism':
-                continue
+                formats.extend(self._extract_ism_formats(
+                    format_url + '/Manifest', display_id, 'mss', fatal=False))
             if 'm3u8' in format_url:
                 # m3u8_native should not be used here until
                 # https://github.com/rg3/youtube-dl/issues/9913 is fixed
@@ -4,6 +4,7 @@ import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from .adobepass import AdobePassIE
|
from .theplatform import ThePlatformIE
from ..utils import (
    smuggle_url,
    url_basename,
@@ -65,7 +66,7 @@ class NationalGeographicVideoIE(InfoExtractor):
    }


class NationalGeographicIE(AdobePassIE):
class NationalGeographicIE(ThePlatformIE, AdobePassIE):
    IE_NAME = 'natgeo'
    _VALID_URL = r'https?://channel\.nationalgeographic\.com/(?:wild/)?[^/]+/(?:videos|episodes)/(?P<id>[^/?]+)'

@@ -110,25 +111,39 @@ class NationalGeographicIE(AdobePassIE):
        release_url = self._search_regex(
            r'video_auth_playlist_url\s*=\s*"([^"]+)"',
            webpage, 'release url')
        theplatform_path = self._search_regex(r'https?://link.theplatform.com/s/([^?]+)', release_url, 'theplatform path')
        video_id = theplatform_path.split('/')[-1]
        query = {
            'mbr': 'true',
            'switch': 'http',
        }
        is_auth = self._search_regex(r'video_is_auth\s*=\s*"([^"]+)"', webpage, 'is auth', fatal=False)
        if is_auth == 'auth':
            auth_resource_id = self._search_regex(
                r"video_auth_resourceId\s*=\s*'([^']+)'",
                webpage, 'auth resource id')
            query['auth'] = self._extract_mvpd_auth(url, display_id, 'natgeo', auth_resource_id)
            query['auth'] = self._extract_mvpd_auth(url, video_id, 'natgeo', auth_resource_id)

        return {
            '_type': 'url_transparent',
            'ie_key': 'ThePlatform',
            'url': smuggle_url(
                update_url_query(release_url, query),
                {'force_smil_url': True}),
        formats = []
        subtitles = {}
        for key, value in (('switch', 'http'), ('manifest', 'm3u')):
            tp_query = query.copy()
            tp_query.update({
                key: value,
            })
            tp_formats, tp_subtitles = self._extract_theplatform_smil(
                update_url_query(release_url, tp_query), video_id, 'Downloading %s SMIL data' % value)
            formats.extend(tp_formats)
            subtitles = self._merge_subtitles(subtitles, tp_subtitles)
        self._sort_formats(formats)

        info = self._extract_theplatform_metadata(theplatform_path, display_id)
        info.update({
            'id': video_id,
            'formats': formats,
            'subtitles': subtitles,
            'display_id': display_id,
        }
        })
        return info


class NationalGeographicEpisodeGuideIE(InfoExtractor):
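The reworked NationalGeographicIE above downloads the same release URL under two query variants, ('switch', 'http') and ('manifest', 'm3u'). A minimal standalone sketch of that query merge (hypothetical helper name, not part of youtube-dl):

```python
def query_variants(base_query):
    # Build one query dict per (key, value) variant, mirroring the
    # ('switch', 'http') / ('manifest', 'm3u') loop in the extractor.
    variants = []
    for key, value in (('switch', 'http'), ('manifest', 'm3u')):
        q = base_query.copy()
        q[key] = value
        variants.append(q)
    return variants
```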
@@ -1,6 +1,8 @@
# coding: utf-8
from __future__ import unicode_literals

import re

from .mtv import MTVServicesInfoExtractor
from ..utils import update_url_query

@@ -69,7 +71,7 @@ class NickIE(MTVServicesInfoExtractor):

class NickDeIE(MTVServicesInfoExtractor):
    IE_NAME = 'nick.de'
    _VALID_URL = r'https?://(?:www\.)?(?:nick\.de|nickelodeon\.nl)/(?:playlist|shows)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
    _VALID_URL = r'https?://(?:www\.)?(?P<host>nick\.de|nickelodeon\.(?:nl|at))/(?:playlist|shows)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
    _TESTS = [{
        'url': 'http://www.nick.de/playlist/3773-top-videos/videos/episode/17306-zu-wasser-und-zu-land-rauchende-erdnusse',
        'only_matching': True,
@@ -79,15 +81,43 @@ class NickDeIE(MTVServicesInfoExtractor):
    }, {
        'url': 'http://www.nickelodeon.nl/shows/474-spongebob/videos/17403-een-kijkje-in-de-keuken-met-sandy-van-binnenuit',
        'only_matching': True,
    }, {
        'url': 'http://www.nickelodeon.at/playlist/3773-top-videos/videos/episode/77993-das-letzte-gefecht',
        'only_matching': True,
    }]

    def _extract_mrss_url(self, webpage, host):
        return update_url_query(self._search_regex(
            r'data-mrss=(["\'])(?P<url>http.+?)\1', webpage, 'mrss url', group='url'),
            {'siteKey': host})

    def _real_extract(self, url):
        video_id = self._match_id(url)
        mobj = re.match(self._VALID_URL, url)
        video_id = mobj.group('id')
        host = mobj.group('host')

        webpage = self._download_webpage(url, video_id)

        mrss_url = update_url_query(self._search_regex(
            r'data-mrss=(["\'])(?P<url>http.+?)\1', webpage, 'mrss url', group='url'),
            {'siteKey': 'nick.de'})
        mrss_url = self._extract_mrss_url(webpage, host)

        return self._get_videos_info_from_url(mrss_url, video_id)


class NickNightIE(NickDeIE):
    IE_NAME = 'nicknight'
    _VALID_URL = r'https?://(?:www\.)(?P<host>nicknight\.(?:de|at|tv))/(?:playlist|shows)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
    _TESTS = [{
        'url': 'http://www.nicknight.at/shows/977-awkward/videos/85987-nimmer-beste-freunde',
        'only_matching': True,
    }, {
        'url': 'http://www.nicknight.at/shows/977-awkward',
        'only_matching': True,
    }, {
        'url': 'http://www.nicknight.at/shows/1900-faking-it',
        'only_matching': True,
    }]

    def _extract_mrss_url(self, webpage, *args):
        return self._search_regex(
            r'mrss\s*:\s*(["\'])(?P<url>http.+?)\1', webpage,
            'mrss url', group='url')
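NickDeIE._extract_mrss_url above appends a siteKey parameter to the mRSS URL via youtube-dl's update_url_query. A stdlib-only sketch of the same step (hypothetical helper, Python 3 urllib.parse in place of the compat layer):

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

def add_site_key(mrss_url, host):
    # Append the siteKey query parameter the way _extract_mrss_url does,
    # preserving any query parameters already on the URL.
    parts = urlsplit(mrss_url)
    query = parse_qs(parts.query)
    query['siteKey'] = [host]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))
```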
62 youtube_dl/extractor/nobelprize.py Normal file
@@ -0,0 +1,62 @@
# coding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor
from ..utils import (
    js_to_json,
    mimetype2ext,
    determine_ext,
    update_url_query,
    get_element_by_attribute,
    int_or_none,
)


class NobelPrizeIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?nobelprize\.org/mediaplayer.*?\bid=(?P<id>\d+)'
    _TEST = {
        'url': 'http://www.nobelprize.org/mediaplayer/?id=2636',
        'md5': '04c81e5714bb36cc4e2232fee1d8157f',
        'info_dict': {
            'id': '2636',
            'ext': 'mp4',
            'title': 'Announcement of the 2016 Nobel Prize in Physics',
            'description': 'md5:05beba57f4f5a4bbd4cf2ef28fcff739',
        }
    }

    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
        media = self._parse_json(self._search_regex(
            r'(?s)var\s*config\s*=\s*({.+?});', webpage,
            'config'), video_id, js_to_json)['media']
        title = media['title']

        formats = []
        for source in media.get('source', []):
            source_src = source.get('src')
            if not source_src:
                continue
            ext = mimetype2ext(source.get('type')) or determine_ext(source_src)
            if ext == 'm3u8':
                formats.extend(self._extract_m3u8_formats(
                    source_src, video_id, 'mp4', 'm3u8_native',
                    m3u8_id='hls', fatal=False))
            elif ext == 'f4m':
                formats.extend(self._extract_f4m_formats(
                    update_url_query(source_src, {'hdcore': '3.7.0'}),
                    video_id, f4m_id='hds', fatal=False))
            else:
                formats.append({
                    'url': source_src,
                })
        self._sort_formats(formats)

        return {
            'id': video_id,
            'title': title,
            'description': get_element_by_attribute('itemprop', 'description', webpage),
            'duration': int_or_none(media.get('duration')),
            'formats': formats,
        }
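NobelPrizeIE picks a format handler from the extension, preferring the source's MIME type and falling back to the URL. A rough stand-in for the mimetype2ext/determine_ext pair (only a few assumed mappings shown; youtube-dl's real helpers cover far more types):

```python
import posixpath

# Minimal, assumed mapping; not youtube-dl's full mimetype2ext table.
_MIME_TO_EXT = {
    'application/x-mpegurl': 'm3u8',
    'application/f4m+xml': 'f4m',
    'video/mp4': 'mp4',
}

def pick_ext(mime_type, src_url):
    # Prefer the MIME type; fall back to the URL's file extension.
    ext = _MIME_TO_EXT.get((mime_type or '').lower())
    if ext:
        return ext
    return posixpath.splitext(src_url)[1].lstrip('.') or None
```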
@@ -113,7 +113,17 @@ class NRKBaseIE(InfoExtractor):


class NRKIE(NRKBaseIE):
    _VALID_URL = r'(?:nrk:|https?://(?:www\.)?nrk\.no/video/PS\*)(?P<id>\d+)'
    _VALID_URL = r'''(?x)
                        (?:
                            nrk:|
                            https?://
                                (?:
                                    (?:www\.)?nrk\.no/video/PS\*|
                                    v8-psapi\.nrk\.no/mediaelement/
                                )
                        )
                        (?P<id>[^/?#&]+)
                    '''
    _API_HOST = 'v8.psapi.nrk.no'
    _TESTS = [{
        # video
@@ -137,6 +147,12 @@ class NRKIE(NRKBaseIE):
            'description': 'md5:a621f5cc1bd75c8d5104cb048c6b8568',
            'duration': 20,
        }
    }, {
        'url': 'nrk:ecc1b952-96dc-4a98-81b9-5296dc7a98d9',
        'only_matching': True,
    }, {
        'url': 'https://v8-psapi.nrk.no/mediaelement/ecc1b952-96dc-4a98-81b9-5296dc7a98d9',
        'only_matching': True,
    }]
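The widened NRK _VALID_URL above accepts both nrk: IDs and v8-psapi.nrk.no/mediaelement/ URLs. A quick check of the verbose pattern in isolation:

```python
import re

# The widened NRK _VALID_URL from the change above (verbose mode).
VALID_URL = r'''(?x)
    (?:
        nrk:|
        https?://
            (?:
                (?:www\.)?nrk\.no/video/PS\*|
                v8-psapi\.nrk\.no/mediaelement/
            )
    )
    (?P<id>[^/?#&]+)
'''

def match_id(url):
    m = re.match(VALID_URL, url)
    return m.group('id') if m else None
```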
@@ -1,3 +1,4 @@
# coding: utf-8
from __future__ import unicode_literals

import hmac
@@ -6,11 +7,13 @@ import base64

from .common import InfoExtractor
from ..utils import (
    determine_ext,
    float_or_none,
    int_or_none,
    parse_iso8601,
    js_to_json,
    mimetype2ext,
    determine_ext,
    parse_iso8601,
    remove_start,
)


@@ -138,16 +141,83 @@ class NYTimesArticleIE(NYTimesBaseIE):
            'upload_date': '20150414',
            'uploader': 'Matthew Williams',
        }
    }, {
        'url': 'http://www.nytimes.com/2016/10/14/podcasts/revelations-from-the-final-weeks.html',
        'md5': 'e0d52040cafb07662acf3c9132db3575',
        'info_dict': {
            'id': '100000004709062',
            'title': 'The Run-Up: ‘He Was Like an Octopus’',
            'ext': 'mp3',
            'description': 'md5:fb5c6b93b12efc51649b4847fe066ee4',
            'series': 'The Run-Up',
            'episode': '‘He Was Like an Octopus’',
            'episode_number': 20,
            'duration': 2130,
        }
    }, {
        'url': 'http://www.nytimes.com/2016/10/16/books/review/inside-the-new-york-times-book-review-the-rise-of-hitler.html',
        'info_dict': {
            'id': '100000004709479',
            'title': 'The Rise of Hitler',
            'ext': 'mp3',
            'description': 'md5:bce877fd9e3444990cb141875fab0028',
            'creator': 'Pamela Paul',
            'duration': 3475,
        },
        'params': {
            'skip_download': True,
        },
    }, {
        'url': 'http://www.nytimes.com/news/minute/2014/03/17/times-minute-whats-next-in-crimea/?_php=true&_type=blogs&_php=true&_type=blogs&_r=1',
        'only_matching': True,
    }]

    def _extract_podcast_from_json(self, json, page_id, webpage):
        podcast_audio = self._parse_json(
            json, page_id, transform_source=js_to_json)

        audio_data = podcast_audio['data']
        track = audio_data['track']

        episode_title = track['title']
        video_url = track['source']

        description = track.get('description') or self._html_search_meta(
            ['og:description', 'twitter:description'], webpage)

        podcast_title = audio_data.get('podcast', {}).get('title')
        title = ('%s: %s' % (podcast_title, episode_title)
                 if podcast_title else episode_title)

        episode = audio_data.get('podcast', {}).get('episode') or ''
        episode_number = int_or_none(self._search_regex(
            r'[Ee]pisode\s+(\d+)', episode, 'episode number', default=None))

        return {
            'id': remove_start(podcast_audio.get('target'), 'FT') or page_id,
            'url': video_url,
            'title': title,
            'description': description,
            'creator': track.get('credit'),
            'series': podcast_title,
            'episode': episode_title,
            'episode_number': episode_number,
            'duration': int_or_none(track.get('duration')),
        }

    def _real_extract(self, url):
        video_id = self._match_id(url)
        page_id = self._match_id(url)

        webpage = self._download_webpage(url, video_id)
        webpage = self._download_webpage(url, page_id)

        video_id = self._html_search_regex(r'data-videoid="(\d+)"', webpage, 'video id')
        video_id = self._search_regex(
            r'data-videoid=["\'](\d+)', webpage, 'video id',
            default=None, fatal=False)
        if video_id is not None:
            return self._extract_video_from_id(video_id)

        return self._extract_video_from_id(video_id)
        podcast_data = self._search_regex(
            (r'NYTD\.FlexTypes\.push\s*\(\s*({.+?})\s*\)\s*;\s*</script',
             r'NYTD\.FlexTypes\.push\s*\(\s*({.+})\s*\)\s*;'),
            webpage, 'podcast data')
        return self._extract_podcast_from_json(podcast_data, page_id, webpage)
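The podcast branch above derives episode_number from a free-form label with the r'[Ee]pisode\s+(\d+)' regex. A standalone sketch of that lookup (hypothetical function name):

```python
import re

def parse_episode_number(episode_label):
    # Mirrors the r'[Ee]pisode\s+(\d+)' search in the extractor;
    # returns None when no number is present.
    m = re.search(r'[Ee]pisode\s+(\d+)', episode_label or '')
    return int(m.group(1)) if m else None
```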
@@ -56,8 +56,8 @@ class OnetBaseIE(InfoExtractor):
                continue
            ext = determine_ext(video_url)
            if format_id == 'ism':
                # TODO: Support Microsoft Smooth Streaming
                continue
                formats.extend(self._extract_ism_formats(
                    video_url, video_id, 'mss', fatal=False))
            elif ext == 'mpd':
                formats.extend(self._extract_mpd_formats(
                    video_url, video_id, mpd_id='dash', fatal=False))
@@ -70,14 +70,19 @@ class OpenloadIE(InfoExtractor):
            r'<span[^>]*>([^<]+)</span>\s*<span[^>]*>[^<]+</span>\s*<span[^>]+id="streamurl"',
            webpage, 'encrypted data')

        magic = compat_ord(enc_data[-1])
        video_url_chars = []

        for idx, c in enumerate(enc_data):
            j = compat_ord(c)
            if j == magic:
                j -= 1
            elif j == magic - 1:
                j += 1
            if j >= 33 and j <= 126:
                j = ((j + 14) % 94) + 33
            if idx == len(enc_data) - 1:
                j += 2
                j += 3
            video_url_chars += compat_chr(j)

        video_url = 'https://openload.co/stream/%s?mime=true' % ''.join(video_url_chars)
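The Openload character transform above can be sketched as a standalone function (plain ord/chr instead of the compat helpers, using the updated += 3 last-character offset; the scheme is site-specific obfuscation and may change at any time):

```python
def decode_stream_id(enc_data):
    # Mirrors the loop above: the last character is the "magic" byte,
    # characters equal to it (or one below) are nudged, printable ASCII
    # is rotated by 14 within 33..126, and the final character gets an
    # extra offset of 3.
    magic = ord(enc_data[-1])
    chars = []
    for idx, c in enumerate(enc_data):
        j = ord(c)
        if j == magic:
            j -= 1
        elif j == magic - 1:
            j += 1
        if 33 <= j <= 126:
            j = ((j + 14) % 94) + 33
        if idx == len(enc_data) - 1:
            j += 3
        chars.append(chr(j))
    return ''.join(chars)
```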
@@ -115,12 +115,22 @@ class ORFTVthekIE(InfoExtractor):
            self._check_formats(formats, video_id)
            self._sort_formats(formats)

            subtitles = {}
            for sub in sd.get('subtitles', []):
                sub_src = sub.get('src')
                if not sub_src:
                    continue
                subtitles.setdefault(sub.get('lang', 'de-AT'), []).append({
                    'url': sub_src,
                })

            upload_date = unified_strdate(sd.get('created_date'))
            entries.append({
                '_type': 'video',
                'id': video_id,
                'title': title,
                'formats': formats,
                'subtitles': subtitles,
                'description': sd.get('description'),
                'duration': int_or_none(sd.get('duration_in_seconds')),
                'upload_date': upload_date,
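The ORF subtitle collection above groups tracks by language with dict.setdefault, defaulting to 'de-AT'. The same pattern in isolation:

```python
def collect_subtitles(subs):
    # Group subtitle URLs by language; entries without a src are skipped.
    subtitles = {}
    for sub in subs:
        sub_src = sub.get('src')
        if not sub_src:
            continue
        subtitles.setdefault(sub.get('lang', 'de-AT'), []).append({'url': sub_src})
    return subtitles
```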
91 youtube_dl/extractor/pandatv.py Normal file
@@ -0,0 +1,91 @@
# coding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor
from ..utils import (
    ExtractorError,
    qualities,
)


class PandaTVIE(InfoExtractor):
    IE_DESC = '熊猫TV'
    _VALID_URL = r'http://(?:www\.)?panda\.tv/(?P<id>[0-9]+)'
    _TEST = {
        'url': 'http://www.panda.tv/10091',
        'info_dict': {
            'id': '10091',
            'title': 're:.+',
            'uploader': '囚徒',
            'ext': 'flv',
            'is_live': True,
        },
        'params': {
            'skip_download': True,
        },
        'skip': 'Live stream is offline',
    }

    def _real_extract(self, url):
        video_id = self._match_id(url)

        config = self._download_json(
            'http://www.panda.tv/api_room?roomid=%s' % video_id, video_id)

        error_code = config.get('errno', 0)
        if error_code != 0:
            raise ExtractorError(
                '%s returned error %s: %s'
                % (self.IE_NAME, error_code, config['errmsg']),
                expected=True)

        data = config['data']
        video_info = data['videoinfo']

        # 2 = live, 3 = offline
        if video_info.get('status') != '2':
            raise ExtractorError(
                'Live stream is offline', expected=True)

        title = data['roominfo']['name']
        uploader = data.get('hostinfo', {}).get('name')
        room_key = video_info['room_key']
        stream_addr = video_info.get(
            'stream_addr', {'OD': '1', 'HD': '1', 'SD': '1'})

        # Reverse engineered from web player swf
        # (http://s6.pdim.gs/static/07153e425f581151.swf at the moment of
        # writing).
        plflag0, plflag1 = video_info['plflag'].split('_')
        plflag0 = int(plflag0) - 1
        if plflag1 == '21':
            plflag0 = 10
            plflag1 = '4'
        live_panda = 'live_panda' if plflag0 < 1 else ''

        quality_key = qualities(['OD', 'HD', 'SD'])
        suffix = ['_small', '_mid', '']
        formats = []
        for k, v in stream_addr.items():
            if v != '1':
                continue
            quality = quality_key(k)
            if quality <= 0:
                continue
            for pref, (ext, pl) in enumerate((('m3u8', '-hls'), ('flv', ''))):
                formats.append({
                    'url': 'http://pl%s%s.live.panda.tv/live_panda/%s%s%s.%s'
                           % (pl, plflag1, room_key, live_panda, suffix[quality], ext),
                    'format_id': '%s-%s' % (k, ext),
                    'quality': quality,
                    'source_preference': pref,
                })
        self._sort_formats(formats)

        return {
            'id': video_id,
            'title': self._live_title(title),
            'uploader': uploader,
            'formats': formats,
            'is_live': True,
        }
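The panda.tv stream URL construction above can be isolated as follows (quality suffix omitted for brevity; the plflag values are reverse-engineered, per the comment in the extractor, so treat this as illustrative only):

```python
def build_stream_urls(plflag, room_key):
    # Sketch of the plflag handling and URL templates used above.
    plflag0, plflag1 = plflag.split('_')
    plflag0 = int(plflag0) - 1
    if plflag1 == '21':
        plflag0 = 10
        plflag1 = '4'
    live_panda = 'live_panda' if plflag0 < 1 else ''
    urls = []
    for ext, pl in (('m3u8', '-hls'), ('flv', '')):
        urls.append('http://pl%s%s.live.panda.tv/live_panda/%s%s.%s'
                    % (pl, plflag1, room_key, live_panda, ext))
    return urls
```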
@@ -4,7 +4,6 @@ import collections
import json
import os
import random
import re

from .common import InfoExtractor
from ..compat import (
@@ -12,6 +11,7 @@ from ..compat import (
    compat_urlparse,
)
from ..utils import (
    dict_get,
    ExtractorError,
    float_or_none,
    int_or_none,
@@ -23,12 +23,12 @@ from ..utils import (


class PluralsightBaseIE(InfoExtractor):
    _API_BASE = 'http://app.pluralsight.com'
    _API_BASE = 'https://app.pluralsight.com'


class PluralsightIE(PluralsightBaseIE):
    IE_NAME = 'pluralsight'
    _VALID_URL = r'https?://(?:(?:www|app)\.)?pluralsight\.com/training/player\?'
    _VALID_URL = r'https?://(?:(?:www|app)\.)?pluralsight\.com/(?:training/)?player\?'
    _LOGIN_URL = 'https://app.pluralsight.com/id/'

    _NETRC_MACHINE = 'pluralsight'
@@ -50,6 +50,9 @@ class PluralsightIE(PluralsightBaseIE):
        # available without pluralsight account
        'url': 'http://app.pluralsight.com/training/player?author=scott-allen&name=angularjs-get-started-m1-introduction&mode=live&clip=0&course=angularjs-get-started',
        'only_matching': True,
    }, {
        'url': 'https://app.pluralsight.com/player?course=ccna-intro-networking&author=ross-bagurdes&name=ccna-intro-networking-m06&clip=0',
        'only_matching': True,
    }]

    def _real_initialize(self):
@@ -99,7 +102,7 @@ class PluralsightIE(PluralsightBaseIE):
            'm': name,
        }
        captions = self._download_json(
            '%s/training/Player/Captions' % self._API_BASE, video_id,
            '%s/player/retrieve-captions' % self._API_BASE, video_id,
            'Downloading captions JSON', 'Unable to download captions JSON',
            fatal=False, data=json.dumps(captions_post).encode('utf-8'),
            headers={'Content-Type': 'application/json;charset=utf-8'})
@@ -117,14 +120,17 @@ class PluralsightIE(PluralsightBaseIE):
    @staticmethod
    def _convert_subtitles(duration, subs):
        srt = ''
        TIME_OFFSET_KEYS = ('displayTimeOffset', 'DisplayTimeOffset')
        TEXT_KEYS = ('text', 'Text')
        for num, current in enumerate(subs):
            current = subs[num]
            start, text = float_or_none(
                current.get('DisplayTimeOffset')), current.get('Text')
            start, text = (
                float_or_none(dict_get(current, TIME_OFFSET_KEYS)),
                dict_get(current, TEXT_KEYS))
            if start is None or text is None:
                continue
            end = duration if num == len(subs) - 1 else float_or_none(
                subs[num + 1].get('DisplayTimeOffset'))
                dict_get(subs[num + 1], TIME_OFFSET_KEYS))
            if end is None:
                continue
            srt += os.linesep.join(
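The reworked _convert_subtitles above tolerates both key casings via dict_get. A minimal sketch of that timing logic (dict_get re-implemented here as a stand-in, SRT serialization omitted):

```python
def dict_get(d, keys):
    # First non-None value among the candidate keys, as in youtube-dl.
    for key in keys:
        value = d.get(key)
        if value is not None:
            return value
    return None

def convert_subtitles(duration, subs):
    # Each cue runs until the next cue's start; the last one runs to the
    # clip duration. Cues missing a start or text are dropped.
    TIME_OFFSET_KEYS = ('displayTimeOffset', 'DisplayTimeOffset')
    TEXT_KEYS = ('text', 'Text')
    cues = []
    for num, current in enumerate(subs):
        start = dict_get(current, TIME_OFFSET_KEYS)
        text = dict_get(current, TEXT_KEYS)
        if start is None or text is None:
            continue
        end = duration if num == len(subs) - 1 else dict_get(
            subs[num + 1], TIME_OFFSET_KEYS)
        if end is None:
            continue
        cues.append((float(start), float(end), text))
    return cues
```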
@@ -144,28 +150,22 @@ class PluralsightIE(PluralsightBaseIE):
        author = qs.get('author', [None])[0]
        name = qs.get('name', [None])[0]
        clip_id = qs.get('clip', [None])[0]
        course = qs.get('course', [None])[0]
        course_name = qs.get('course', [None])[0]

        if any(not f for f in (author, name, clip_id, course,)):
        if any(not f for f in (author, name, clip_id, course_name,)):
            raise ExtractorError('Invalid URL', expected=True)

        display_id = '%s-%s' % (name, clip_id)

        webpage = self._download_webpage(url, display_id)
        parsed_url = compat_urlparse.urlparse(url)

        modules = self._search_regex(
            r'moduleCollection\s*:\s*new\s+ModuleCollection\((\[.+?\])\s*,\s*\$rootScope\)',
            webpage, 'modules', default=None)
        payload_url = compat_urlparse.urlunparse(parsed_url._replace(
            netloc='app.pluralsight.com', path='player/api/v1/payload'))

        if modules:
            collection = self._parse_json(modules, display_id)
        else:
            # Webpage may be served in different layout (see
            # https://github.com/rg3/youtube-dl/issues/7607)
            collection = self._parse_json(
                self._search_regex(
                    r'var\s+initialState\s*=\s*({.+?});\n', webpage, 'initial state'),
                display_id)['course']['modules']
        course = self._download_json(
            payload_url, display_id, headers={'Referer': url})['payload']['course']

        collection = course['modules']

        module, clip = None, None

@@ -206,8 +206,7 @@ class PluralsightIE(PluralsightBaseIE):

        # Some courses also offer widescreen resolution for high quality (see
        # https://github.com/rg3/youtube-dl/issues/7766)
        widescreen = True if re.search(
            r'courseSupportsWidescreenVideoFormats\s*:\s*true', webpage) else False
        widescreen = course.get('supportsWideScreenVideoFormats') is True
        best_quality = 'high-widescreen' if widescreen else 'high'
        if widescreen:
            for allowed_quality in ALLOWED_QUALITIES:
@@ -236,19 +235,19 @@ class PluralsightIE(PluralsightBaseIE):
            for quality in qualities_:
                f = QUALITIES[quality].copy()
                clip_post = {
                    'a': author,
                    'cap': 'false',
                    'cn': clip_id,
                    'course': course,
                    'lc': 'en',
                    'm': name,
                    'mt': ext,
                    'q': '%dx%d' % (f['width'], f['height']),
                    'author': author,
                    'includeCaptions': False,
                    'clipIndex': int(clip_id),
                    'courseName': course_name,
                    'locale': 'en',
                    'moduleName': name,
                    'mediaType': ext,
                    'quality': '%dx%d' % (f['width'], f['height']),
                }
                format_id = '%s-%s' % (ext, quality)
                clip_url = self._download_webpage(
                    '%s/training/Player/ViewClip' % self._API_BASE, display_id,
                    'Downloading %s URL' % format_id, fatal=False,
                viewclip = self._download_json(
                    '%s/video/clips/viewclip' % self._API_BASE, display_id,
                    'Downloading %s viewclip JSON' % format_id, fatal=False,
                    data=json.dumps(clip_post).encode('utf-8'),
                    headers={'Content-Type': 'application/json;charset=utf-8'})

@@ -262,15 +261,28 @@ class PluralsightIE(PluralsightBaseIE):
                    random.randint(2, 5), display_id,
                    '%(video_id)s: Waiting for %(timeout)s seconds to avoid throttling')

                if not clip_url:
                if not viewclip:
                    continue
                f.update({
                    'url': clip_url,
                    'ext': ext,
                    'format_id': format_id,
                    'quality': quality_key(quality),
                })
                formats.append(f)

                clip_urls = viewclip.get('urls')
                if not isinstance(clip_urls, list):
                    continue

                for clip_url_data in clip_urls:
                    clip_url = clip_url_data.get('url')
                    if not clip_url:
                        continue
                    cdn = clip_url_data.get('cdn')
                    clip_f = f.copy()
                    clip_f.update({
                        'url': clip_url,
                        'ext': ext,
                        'format_id': '%s-%s' % (format_id, cdn) if cdn else format_id,
                        'quality': quality_key(quality),
                        'source_preference': int_or_none(clip_url_data.get('rank')),
                    })
                    formats.append(clip_f)

        self._sort_formats(formats)

        duration = int_or_none(
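The new Pluralsight flow above rebuilds the player URL into the payload API endpoint with urlparse._replace, keeping the original query string. A Python 3 stdlib equivalent of that step (compat_urlparse maps to urllib.parse):

```python
from urllib.parse import urlparse, urlunparse

def payload_url(player_url):
    # Swap host and path while preserving scheme, query and fragment,
    # the same _replace trick the extractor uses.
    parsed = urlparse(player_url)
    return urlunparse(parsed._replace(
        netloc='app.pluralsight.com', path='player/api/v1/payload'))
```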
76 youtube_dl/extractor/rentv.py Normal file
@@ -0,0 +1,76 @@
# coding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor
from .jwplatform import JWPlatformBaseIE
from ..compat import compat_str


class RENTVIE(JWPlatformBaseIE):
    _VALID_URL = r'(?:rentv:|https?://(?:www\.)?ren\.tv/(?:player|video/epizod)/)(?P<id>\d+)'
    _TESTS = [{
        'url': 'http://ren.tv/video/epizod/118577',
        'md5': 'd91851bf9af73c0ad9b2cdf76c127fbb',
        'info_dict': {
            'id': '118577',
            'ext': 'mp4',
            'title': 'Документальный спецпроект: "Промывка мозгов. Технологии XXI века"'
        }
    }, {
        'url': 'http://ren.tv/player/118577',
        'only_matching': True,
    }, {
        'url': 'rentv:118577',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage('http://ren.tv/player/' + video_id, video_id)
        jw_config = self._parse_json(self._search_regex(
            r'config\s*=\s*({.+});', webpage, 'jw config'), video_id)
        return self._parse_jwplayer_data(jw_config, video_id, m3u8_id='hls')


class RENTVArticleIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?ren\.tv/novosti/\d{4}-\d{2}-\d{2}/(?P<id>[^/?#]+)'
    _TESTS = [{
        'url': 'http://ren.tv/novosti/2016-10-26/video-mikroavtobus-popavshiy-v-dtp-s-gruzovikami-v-podmoskove-prevratilsya-v',
        'md5': 'ebd63c4680b167693745ab91343df1d6',
        'info_dict': {
            'id': '136472',
            'ext': 'mp4',
            'title': 'Видео: микроавтобус, попавший в ДТП с грузовиками в Подмосковье, превратился в груду металла',
            'description': 'Жертвами столкновения двух фур и микроавтобуса, по последним данным, стали семь человек.',
        }
    }, {
        # TODO: invalid m3u8
        'url': 'http://ren.tv/novosti/2015-09-25/sluchaynyy-prohozhiy-poymal-avtougonshchika-v-murmanske-video',
        'info_dict': {
            'id': 'playlist',
            'ext': 'mp4',
            'title': 'Случайный прохожий поймал автоугонщика в Мурманске. ВИДЕО | РЕН ТВ',
            'uploader': 'ren.tv',
        },
        'params': {
            # m3u8 downloads
            'skip_download': True,
        },
        'skip': True,
    }]

    def _real_extract(self, url):
        display_id = self._match_id(url)
        webpage = self._download_webpage(url, display_id)
        drupal_settings = self._parse_json(self._search_regex(
            r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);',
            webpage, 'drupal settings'), display_id)

        entries = []
        for config_profile in drupal_settings.get('ren_jwplayer', {}).values():
            media_id = config_profile.get('mediaid')
            if not media_id:
                continue
            media_id = compat_str(media_id)
            entries.append(self.url_result('rentv:' + media_id, 'RENTV', media_id))
        return self.playlist_result(entries, display_id)
@@ -10,11 +10,38 @@ from ..utils import (
)


class SharedIE(InfoExtractor):
    IE_DESC = 'shared.sx and vivo.sx'
    _VALID_URL = r'https?://(?:shared|vivo)\.sx/(?P<id>[\da-z]{10})'
class SharedBaseIE(InfoExtractor):
    def _real_extract(self, url):
        video_id = self._match_id(url)

    _TESTS = [{
        webpage, urlh = self._download_webpage_handle(url, video_id)

        if self._FILE_NOT_FOUND in webpage:
            raise ExtractorError(
                'Video %s does not exist' % video_id, expected=True)

        video_url = self._extract_video_url(webpage, video_id, url)

        title = base64.b64decode(self._html_search_meta(
            'full:title', webpage, 'title').encode('utf-8')).decode('utf-8')
        filesize = int_or_none(self._html_search_meta(
            'full:size', webpage, 'file size', fatal=False))

        return {
            'id': video_id,
            'url': video_url,
            'ext': 'mp4',
            'filesize': filesize,
            'title': title,
        }


class SharedIE(SharedBaseIE):
    IE_DESC = 'shared.sx'
    _VALID_URL = r'https?://shared\.sx/(?P<id>[\da-z]{10})'
    _FILE_NOT_FOUND = '>File does not exist<'

    _TEST = {
        'url': 'http://shared.sx/0060718775',
        'md5': '106fefed92a8a2adb8c98e6a0652f49b',
        'info_dict': {
@@ -23,7 +50,32 @@ class SharedIE(InfoExtractor):
            'title': 'Bmp4',
            'filesize': 1720110,
        },
    }, {
    }

    def _extract_video_url(self, webpage, video_id, url):
        download_form = self._hidden_inputs(webpage)

        video_page = self._download_webpage(
            url, video_id, 'Downloading video page',
            data=urlencode_postdata(download_form),
            headers={
                'Content-Type': 'application/x-www-form-urlencoded',
                'Referer': url,
            })

        video_url = self._html_search_regex(
            r'data-url=(["\'])(?P<url>(?:(?!\1).)+)\1',
            video_page, 'video URL', group='url')

        return video_url


class VivoIE(SharedBaseIE):
    IE_DESC = 'vivo.sx'
    _VALID_URL = r'https?://vivo\.sx/(?P<id>[\da-z]{10})'
    _FILE_NOT_FOUND = '>The file you have requested does not exists or has been removed'

    _TEST = {
        'url': 'http://vivo.sx/d7ddda0e78',
        'md5': '15b3af41be0b4fe01f4df075c2678b2c',
        'info_dict': {
@@ -32,43 +84,13 @@ class SharedIE(InfoExtractor):
            'title': 'Chicken',
            'filesize': 528031,
        },
    }]
    }

    def _real_extract(self, url):
        video_id = self._match_id(url)

        webpage, urlh = self._download_webpage_handle(url, video_id)

        if '>File does not exist<' in webpage:
            raise ExtractorError(
                'Video %s does not exist' % video_id, expected=True)

        download_form = self._hidden_inputs(webpage)

        video_page = self._download_webpage(
            urlh.geturl(), video_id, 'Downloading video page',
            data=urlencode_postdata(download_form),
            headers={
                'Content-Type': 'application/x-www-form-urlencoded',
                'Referer': urlh.geturl(),
            })

        video_url = self._html_search_regex(
            r'data-url=(["\'])(?P<url>(?:(?!\1).)+)\1',
            video_page, 'video URL', group='url')
        title = base64.b64decode(self._html_search_meta(
            'full:title', webpage, 'title').encode('utf-8')).decode('utf-8')
        filesize = int_or_none(self._html_search_meta(
            'full:size', webpage, 'file size', fatal=False))
        thumbnail = self._html_search_regex(
            r'data-poster=(["\'])(?P<url>(?:(?!\1).)+)\1',
            video_page, 'thumbnail', default=None, group='url')

        return {
            'id': video_id,
            'url': video_url,
            'ext': 'mp4',
            'filesize': filesize,
            'title': title,
            'thumbnail': thumbnail,
        }
    def _extract_video_url(self, webpage, video_id, *args):
        return self._parse_json(
            self._search_regex(
                r'InitializeStream\s*\(\s*(["\'])(?P<url>(?:(?!\1).)+)\1',
                webpage, 'stream', group='url'),
            video_id,
            transform_source=lambda x: base64.b64decode(
                x.encode('ascii')).decode('utf-8'))[0]
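VivoIE._extract_video_url above base64-decodes an embedded JSON array of stream URLs and takes its first entry. The same transform in isolation:

```python
import base64
import json

def extract_vivo_stream(encoded):
    # Decode the base64 payload to a JSON array and return the first
    # stream URL (sketch of the transform_source + [0] step above).
    decoded = base64.b64decode(encoded.encode('ascii')).decode('utf-8')
    return json.loads(decoded)[0]
```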
@@ -69,7 +69,8 @@ class TVPIE(InfoExtractor):
        webpage = self._download_webpage(url, page_id)
        video_id = self._search_regex([
            r'<iframe[^>]+src="[^"]*?object_id=(\d+)',
            "object_id\s*:\s*'(\d+)'"], webpage, 'video id')
            r"object_id\s*:\s*'(\d+)'",
            r'data-video-id="(\d+)"'], webpage, 'video id', default=page_id)
        return {
            '_type': 'url_transparent',
            'url': 'tvp:' + video_id,
@@ -138,6 +139,9 @@ class TVPEmbedIE(InfoExtractor):
        #     formats.extend(self._extract_mpd_formats(
        #         video_url_base + '.ism/video.mpd',
        #         video_id, mpd_id='dash', fatal=False))
        formats.extend(self._extract_ism_formats(
            video_url_base + '.ism/Manifest',
            video_id, 'mss', fatal=False))
        formats.extend(self._extract_f4m_formats(
            video_url_base + '.ism/video.f4m',
            video_id, f4m_id='hds', fatal=False))
@@ -398,7 +398,7 @@ class TwitchStreamIE(TwitchBaseIE):
        channel_id = self._match_id(url)

        stream = self._call_api(
            'kraken/streams/%s' % channel_id, channel_id,
            'kraken/streams/%s?stream_type=all' % channel_id, channel_id,
            'Downloading stream JSON').get('stream')

        if not stream:
@@ -417,6 +417,7 @@ class TwitchStreamIE(TwitchBaseIE):
        query = {
            'allow_source': 'true',
            'allow_audio_only': 'true',
            'allow_spectre': 'true',
            'p': random.randint(1000000, 10000000),
            'player': 'twitchweb',
            'segment_preference': '4',
@@ -5,17 +5,20 @@ from .common import InfoExtractor
 
 
 class URPlayIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?urplay\.se/program/(?P<id>[0-9]+)'
-    _TEST = {
+    _VALID_URL = r'https?://(?:www\.)?ur(?:play|skola)\.se/(?:program|Produkter)/(?P<id>[0-9]+)'
+    _TESTS = [{
         'url': 'http://urplay.se/program/190031-tripp-trapp-trad-sovkudde',
-        'md5': '15ca67b63fd8fb320ac2bcd854bad7b6',
+        'md5': 'ad5f0de86f16ca4c8062cd103959a9eb',
         'info_dict': {
             'id': '190031',
             'ext': 'mp4',
             'title': 'Tripp, Trapp, Träd : Sovkudde',
             'description': 'md5:b86bffdae04a7e9379d1d7e5947df1d1',
-        }
-    }
+        },
+    }, {
+        'url': 'http://urskola.se/Produkter/155794-Smasagor-meankieli-Grodan-i-vida-varlden',
+        'only_matching': True,
+    }]
 
     def _real_extract(self, url):
         video_id = self._match_id(url)
@@ -27,30 +30,17 @@ class URPlayIE(InfoExtractor):
 
         formats = []
         for quality_attr, quality, preference in (('', 'sd', 0), ('_hd', 'hd', 1)):
-            file_rtmp = urplayer_data.get('file_rtmp' + quality_attr)
-            if file_rtmp:
-                formats.append({
-                    'url': 'rtmp://%s/urplay/mp4:%s' % (host, file_rtmp),
-                    'format_id': quality + '-rtmp',
-                    'ext': 'flv',
-                    'preference': preference,
-                })
             file_http = urplayer_data.get('file_http' + quality_attr) or urplayer_data.get('file_http_sub' + quality_attr)
             if file_http:
-                file_http_base_url = 'http://%s/%s' % (host, file_http)
-                formats.extend(self._extract_f4m_formats(
-                    file_http_base_url + 'manifest.f4m', video_id,
-                    preference, '%s-hds' % quality, fatal=False))
-                formats.extend(self._extract_m3u8_formats(
-                    file_http_base_url + 'playlist.m3u8', video_id, 'mp4',
-                    'm3u8_native', preference, '%s-hls' % quality, fatal=False))
+                formats.extend(self._extract_wowza_formats(
+                    'http://%s/%splaylist.m3u8' % (host, file_http), video_id, skip_protocols=['rtmp', 'rtsp']))
         self._sort_formats(formats)
 
         subtitles = {}
         for subtitle in urplayer_data.get('subtitles', []):
             subtitle_url = subtitle.get('file')
             kind = subtitle.get('kind')
-            if subtitle_url or kind and kind != 'captions':
+            if not subtitle_url or (kind and kind != 'captions'):
                 continue
             subtitles.setdefault(subtitle.get('label', 'Svenska'), []).append({
                 'url': subtitle_url,
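The corrected subtitle condition above keeps only entries that actually carry a URL and are captions (or have no kind at all). A standalone sketch of that fixed filter; the sample entries below are fabricated for illustration:

```python
# Sketch of the corrected URPlay subtitle filter (sample dicts are made up).
def keep_subtitle(subtitle):
    subtitle_url = subtitle.get('file')
    kind = subtitle.get('kind')
    # Fixed logic: drop entries with no URL, or whose kind is set to
    # something other than 'captions'.
    return not (not subtitle_url or (kind and kind != 'captions'))

samples = [
    {'file': 'http://example.com/sv.vtt', 'kind': 'captions'},  # kept
    {'file': 'http://example.com/sv.vtt'},                      # kept (no kind)
    {'kind': 'captions'},                                       # dropped (no URL)
    {'file': 'http://example.com/x.vtt', 'kind': 'chapters'},   # dropped
]
print([keep_subtitle(s) for s in samples])  # → [True, True, False, False]
```

The original condition (`if subtitle_url or kind and kind != 'captions': continue`) skipped exactly the entries that had a URL, which is why no subtitles survived.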
@@ -13,7 +13,7 @@ from ..utils import (
 
 
 class VesselIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?vessel\.com/(?:videos|embed)/(?P<id>[0-9a-zA-Z]+)'
+    _VALID_URL = r'https?://(?:www\.)?vessel\.com/(?:videos|embed)/(?P<id>[0-9a-zA-Z-_]+)'
     _API_URL_TEMPLATE = 'https://www.vessel.com/api/view/items/%s'
     _LOGIN_URL = 'https://www.vessel.com/api/account/login'
     _NETRC_MACHINE = 'vessel'
@@ -32,12 +32,18 @@ class VesselIE(InfoExtractor):
     }, {
         'url': 'https://www.vessel.com/embed/G4U7gUJ6a?w=615&h=346',
         'only_matching': True,
+    }, {
+        'url': 'https://www.vessel.com/videos/F01_dsLj1',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.vessel.com/videos/RRX-sir-J',
+        'only_matching': True,
     }]
 
     @staticmethod
     def _extract_urls(webpage):
         return [url for _, url in re.findall(
-            r'<iframe[^>]+src=(["\'])((?:https?:)?//(?:www\.)?vessel\.com/embed/[0-9a-zA-Z]+.*?)\1',
+            r'<iframe[^>]+src=(["\'])((?:https?:)?//(?:www\.)?vessel\.com/embed/[0-9a-zA-Z-_]+.*?)\1',
             webpage)]
 
     @staticmethod
@@ -1,10 +1,14 @@
 # coding: utf-8
 from __future__ import unicode_literals
 
+import re
+
 from .jwplatform import JWPlatformBaseIE
 from ..utils import (
     decode_packed_codes,
     js_to_json,
+    NO_DEFAULT,
+    PACKED_CODES_RE,
 )
 
@@ -35,10 +39,17 @@ class VidziIE(JWPlatformBaseIE):
         title = self._html_search_regex(
             r'(?s)<h2 class="video-title">(.*?)</h2>', webpage, 'title')
 
-        code = decode_packed_codes(webpage).replace('\\\'', '\'')
-        jwplayer_data = self._parse_json(
-            self._search_regex(r'setup\(([^)]+)\)', code, 'jwplayer data'),
-            video_id, transform_source=js_to_json)
+        packed_codes = [mobj.group(0) for mobj in re.finditer(
+            PACKED_CODES_RE, webpage)]
+        for num, pc in enumerate(packed_codes, 1):
+            code = decode_packed_codes(pc).replace('\\\'', '\'')
+            jwplayer_data = self._parse_json(
+                self._search_regex(
+                    r'setup\(([^)]+)\)', code, 'jwplayer data',
+                    default=NO_DEFAULT if num == len(packed_codes) else '{}'),
+                video_id, transform_source=js_to_json)
+            if jwplayer_data:
+                break
 
         info_dict = self._parse_jwplayer_data(jwplayer_data, video_id, require_title=False)
         info_dict['title'] = title
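The rewritten Vidzi loop tries each packed-code block in turn, passing `NO_DEFAULT` only for the last candidate so that earlier failures stay silent and only the final one raises. The same try-each-candidate pattern in isolation, with `json.loads` standing in for the real parser (names here are illustrative, not from youtube-dl):

```python
import json

def parse_first_valid(raw_candidates):
    # Mirrors the "default=NO_DEFAULT only on the last candidate" idea:
    # earlier candidates may fail silently, the last failure propagates.
    for num, raw in enumerate(raw_candidates, 1):
        try:
            data = json.loads(raw)
        except ValueError:
            if num == len(raw_candidates):
                raise  # last candidate: let the error surface
            continue
        if data:
            return data
    return None

print(parse_first_valid(['not json', '{}', '{"file": "a.mp4"}']))  # → {'file': 'a.mp4'}
```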
@@ -49,7 +49,7 @@ class VierIE(InfoExtractor):
             webpage, 'filename')
 
         playlist_url = 'http://vod.streamcloud.be/%s/_definst_/mp4:%s.mp4/playlist.m3u8' % (application, filename)
-        formats = self._extract_wowza_formats(playlist_url, display_id)
+        formats = self._extract_wowza_formats(playlist_url, display_id, skip_protocols=['dash'])
         self._sort_formats(formats)
 
         title = self._og_search_title(webpage, default=display_id)
@@ -322,6 +322,22 @@ class VimeoIE(VimeoBaseInfoExtractor):
             },
             'expected_warnings': ['Unable to download JSON metadata'],
         },
+        {
+            # redirects to the ondemand extractor and should be passed through it
+            # for successful extraction
+            'url': 'https://vimeo.com/73445910',
+            'info_dict': {
+                'id': '73445910',
+                'ext': 'mp4',
+                'title': 'The Reluctant Revolutionary',
+                'uploader': '10Ft Films',
+                'uploader_url': 're:https?://(?:www\.)?vimeo\.com/tenfootfilms',
+                'uploader_id': 'tenfootfilms',
+            },
+            'params': {
+                'skip_download': True,
+            },
+        },
         {
             'url': 'http://vimeo.com/moogaloop.swf?clip_id=2539741',
             'only_matching': True,
@@ -414,7 +430,12 @@ class VimeoIE(VimeoBaseInfoExtractor):
         # Retrieve video webpage to extract further information
         request = sanitized_Request(url, headers=headers)
         try:
-            webpage = self._download_webpage(request, video_id)
+            webpage, urlh = self._download_webpage_handle(request, video_id)
+            # Some URLs redirect to ondemand and can't be extracted with
+            # this extractor right away, thus should be passed through
+            # the ondemand extractor (e.g. https://vimeo.com/73445910)
+            if VimeoOndemandIE.suitable(urlh.geturl()):
+                return self.url_result(urlh.geturl(), VimeoOndemandIE.ie_key())
         except ExtractorError as ee:
             if isinstance(ee.cause, compat_HTTPError) and ee.cause.code == 403:
                 errmsg = ee.cause.read()
@@ -3,7 +3,6 @@ from __future__ import unicode_literals
 
 import collections
 import re
-import json
 import sys
 
 from .common import InfoExtractor
@@ -369,8 +368,18 @@ class VKIE(VKBaseIE):
                 opts_url = 'http:' + opts_url
             return self.url_result(opts_url)
 
-        data_json = self._search_regex(r'var\s+vars\s*=\s*({.+?});', info_page, 'vars')
-        data = json.loads(data_json)
+        # vars does not look to be served anymore since 24.10.2016
+        data = self._parse_json(
+            self._search_regex(
+                r'var\s+vars\s*=\s*({.+?});', info_page, 'vars', default='{}'),
+            video_id, fatal=False)
+
+        # <!json> is served instead
+        if not data:
+            data = self._parse_json(
+                self._search_regex(
+                    r'<!json>\s*({.+?})\s*<!>', info_page, 'json'),
+                video_id)['player']['params'][0]
 
         title = unescapeHTML(data['md_title'])
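The VK change first tries the legacy `var vars = {...}` payload and, when that is absent, falls back to the JSON that VK now serves between `<!json>` and `<!>` markers. A self-contained sketch of that two-stage extraction; the sample page snippet is fabricated:

```python
import json
import re

def extract_player_data(info_page):
    # First try the legacy "var vars = {...};" assignment.
    m = re.search(r'var\s+vars\s*=\s*({.+?});', info_page)
    if m:
        return json.loads(m.group(1))
    # Fallback: JSON served between <!json> and <!> markers.
    m = re.search(r'<!json>\s*({.+?})\s*<!>', info_page)
    return json.loads(m.group(1))['player']['params'][0]

page = '<!json>{"player": {"params": [{"md_title": "demo"}]}}<!>'
print(extract_player_data(page)['md_title'])  # → demo
```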
@@ -1867,7 +1867,7 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
             'title': 'Uploads from Interstellar Movie',
             'id': 'UUXw-G3eDE9trcvY2sBMM_aA',
         },
-        'playlist_mincout': 21,
+        'playlist_mincount': 21,
     }, {
         # Playlist URL that does not actually serve a playlist
         'url': 'https://www.youtube.com/watch?v=FqZTN594JQw&list=PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4',
@@ -1890,6 +1890,27 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
             'skip_download': True,
         },
         'add_ie': [YoutubeIE.ie_key()],
+    }, {
+        'url': 'https://youtu.be/yeWKywCrFtk?list=PL2qgrgXsNUG5ig9cat4ohreBjYLAPC0J5',
+        'info_dict': {
+            'id': 'yeWKywCrFtk',
+            'ext': 'mp4',
+            'title': 'Small Scale Baler and Braiding Rugs',
+            'uploader': 'Backus-Page House Museum',
+            'uploader_id': 'backuspagemuseum',
+            'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/backuspagemuseum',
+            'upload_date': '20161008',
+            'license': 'Standard YouTube License',
+            'description': 'md5:800c0c78d5eb128500bffd4f0b4f2e8a',
+            'categories': ['Nonprofits & Activism'],
+            'tags': list,
+            'like_count': int,
+            'dislike_count': int,
+        },
+        'params': {
+            'noplaylist': True,
+            'skip_download': True,
+        },
     }, {
         'url': 'https://youtu.be/uWyaPkt-VOI?list=PL9D9FC436B881BA21',
         'only_matching': True,
@@ -1971,8 +1992,10 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
     def _check_download_just_video(self, url, playlist_id):
         # Check if it's a video-specific URL
         query_dict = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
-        if 'v' in query_dict:
-            video_id = query_dict['v'][0]
+        video_id = query_dict.get('v', [None])[0] or self._search_regex(
+            r'(?:^|//)youtu\.be/([0-9A-Za-z_-]{11})', url,
+            'video id', default=None)
+        if video_id:
             if self._downloader.params.get('noplaylist'):
                 self.to_screen('Downloading just video %s because of --no-playlist' % video_id)
                 return video_id, self.url_result(video_id, 'Youtube', video_id=video_id)
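The new fallback pulls an 11-character video id out of bare `youtu.be` short links that have no `v=` query parameter. The regex in isolation, applied to the URL from the test above:

```python
import re

# Regex from the youtube-dl change; matches the 11-character id in
# youtu.be short links.
YOUTU_BE_RE = r'(?:^|//)youtu\.be/([0-9A-Za-z_-]{11})'

url = 'https://youtu.be/yeWKywCrFtk?list=PL2qgrgXsNUG5ig9cat4ohreBjYLAPC0J5'
m = re.search(YOUTU_BE_RE, url)
print(m.group(1))  # → yeWKywCrFtk
```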
@@ -40,7 +40,7 @@ class EmbedThumbnailPP(FFmpegPostProcessor):
                 'Skipping embedding the thumbnail because the file is missing.')
             return [], info
 
-        if info['ext'] in ('mp3', 'mkv'):
+        if info['ext'] == 'mp3':
             options = [
                 '-c', 'copy', '-map', '0', '-map', '1',
                 '-metadata:s:v', 'title="Album cover"', '-metadata:s:v', 'comment="Cover (Front)"']
@@ -278,7 +278,7 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
 
         prefix, sep, ext = path.rpartition('.') # not os.path.splitext, since the latter does not work on unicode in all setups
         new_path = prefix + sep + extension
 
         information['filepath'] = new_path
         information['ext'] = extension
@@ -165,6 +165,8 @@ DATE_FORMATS_MONTH_FIRST.extend([
     '%m/%d/%Y %H:%M:%S',
 ])
 
+PACKED_CODES_RE = r"}\('(.+)',(\d+),(\d+),'([^']+)'\.split\('\|'\)"
+
 
 def preferredencoding():
     """Get preferred encoding.
@@ -1689,6 +1691,10 @@ def url_basename(url):
     return path.strip('/').split('/')[-1]
 
 
+def base_url(url):
+    return re.match(r'https?://[^?#&]+/', url).group()
+
+
 class HEADRequest(compat_urllib_request.Request):
     def get_method(self):
         return 'HEAD'
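The new `base_url` helper returns everything up to and including the last path slash, dropping any query string or fragment — handy for resolving manifest-relative URLs. A quick standalone check (example URLs are made up):

```python
import re

# base_url as added to youtube_dl/utils.py in this range.
def base_url(url):
    return re.match(r'https?://[^?#&]+/', url).group()

# The greedy [^?#&]+ backtracks to the last '/' before any '?', '#' or '&'.
print(base_url('http://example.com/a/b/manifest.f4m?token=x'))  # → http://example.com/a/b/
```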
@@ -1816,8 +1822,12 @@ def get_exe_version(exe, args=['--version'],
     """ Returns the version of the specified executable,
    or False if the executable is not present """
     try:
+        # STDIN should be redirected too. On UNIX-like systems, ffmpeg triggers
+        # SIGTTOU if youtube-dl is run in the background.
+        # See https://github.com/rg3/youtube-dl/issues/955#issuecomment-209789656
         out, _ = subprocess.Popen(
             [encodeArgument(exe)] + args,
+            stdin=subprocess.PIPE,
             stdout=subprocess.PIPE, stderr=subprocess.STDOUT).communicate()
     except OSError:
         return False
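The fix redirects STDIN as well, so a backgrounded youtube-dl is not stopped by SIGTTOU when ffmpeg tries to read the terminal. The same `Popen` pattern in a minimal sketch, probed here with the Python interpreter instead of ffmpeg (`probe_version` is an illustrative name, not youtube-dl's `get_exe_version`):

```python
import subprocess
import sys

def probe_version(exe_args):
    # stdin is redirected too, mirroring the fix: without it, programs that
    # read the terminal can stop a backgrounded process with SIGTTOU.
    try:
        out, _ = subprocess.Popen(
            exe_args,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE, stderr=subprocess.STDOUT).communicate()
    except OSError:
        return False
    return out.decode('utf-8', 'replace').strip()

print(probe_version([sys.executable, '--version']))  # e.g. Python 3.x.y
```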
@@ -2339,11 +2349,18 @@ def _match_one(filter_part, dct):
     m = operator_rex.search(filter_part)
     if m:
         op = COMPARISON_OPERATORS[m.group('op')]
-        if m.group('strval') is not None:
+        actual_value = dct.get(m.group('key'))
+        if (m.group('strval') is not None or
+                # If the original field is a string and the matching comparison value is
+                # a number we should respect the origin of the original field
+                # and process the comparison value as a string (see
+                # https://github.com/rg3/youtube-dl/issues/11082).
+                actual_value is not None and m.group('intval') is not None and
+                isinstance(actual_value, compat_str)):
             if m.group('op') not in ('=', '!='):
                 raise ValueError(
                     'Operator %s does not support string values!' % m.group('op'))
-            comparison_value = m.group('strval')
+            comparison_value = m.group('strval') or m.group('intval')
         else:
             try:
                 comparison_value = int(m.group('intval'))
@@ -2355,7 +2372,6 @@ def _match_one(filter_part, dct):
                 raise ValueError(
                     'Invalid integer value %r in filter part %r' % (
                         m.group('intval'), filter_part))
-        actual_value = dct.get(m.group('key'))
         if actual_value is None:
             return m.group('none_inclusive')
         return op(actual_value, comparison_value)
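The change keeps string-typed fields compared as strings even when the `--match-filter` value looks numeric (issue 11082). A reduced sketch of that decision; `compare` is an illustrative helper, not the real `_match_one`, and only models `=` and `!=`:

```python
def compare(actual_value, op_name, strval, intval):
    # Reduced sketch of the _match_one change: treat the comparison as a
    # string when the filter used quotes (strval) OR the field itself is a
    # string, even if the filter value looks numeric.
    if strval is not None or (
            actual_value is not None and intval is not None and
            isinstance(actual_value, str)):
        if op_name not in ('=', '!='):
            raise ValueError('Operator %s does not support string values!' % op_name)
        comparison_value = strval or intval
    else:
        comparison_value = int(intval)
    equal = actual_value == comparison_value
    return equal if op_name == '=' else not equal

# A string-typed field compared against a numeric-looking filter value is
# now compared as a string, not coerced to int:
print(compare('5', '=', None, '5'), compare(5, '=', None, '5'))  # → True True
```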
@@ -3017,9 +3033,7 @@ def encode_base_n(num, n, table=None):
 
 
 def decode_packed_codes(code):
-    mobj = re.search(
-        r"}\('(.+)',(\d+),(\d+),'([^']+)'\.split\('\|'\)",
-        code)
+    mobj = re.search(PACKED_CODES_RE, code)
     obfucasted_code, base, count, symbols = mobj.groups()
     base = int(base)
     count = int(count)
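`PACKED_CODES_RE` (hoisted into utils so both `decode_packed_codes` and the Vidzi extractor can use it) captures the payload, radix, count, and symbol table of a P.A.C.K.E.R.-style packed script. Applied to a tiny fabricated sample:

```python
import re

# Regex as defined in youtube_dl/utils.py; the packed sample is fabricated.
PACKED_CODES_RE = r"}\('(.+)',(\d+),(\d+),'([^']+)'\.split\('\|'\)"

packed = "eval(function(p,a,c,k,e,d){}('0 1',2,2,'hello|world'.split('|')))"
code, base, count, symbols = re.search(PACKED_CODES_RE, packed).groups()
print(code, base, count, symbols.split('|'))  # → 0 1 2 2 ['hello', 'world']
```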
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals
 
-__version__ = '2016.10.16'
+__version__ = '2016.11.02'