release 2014.01.23.3

[youtube] Add new formats (Fixes #2221 )
Add build instructions (Fixes #2218 )
10 changed files with 148 additions and 76 deletions
--- a/README.md
+++ b/README.md
@@ -181,7 +181,9 @@ which means you can modify it, redistribute it or use it however you like.
                                     preference using slashes: "-f 22/17/18".
                                     "-f mp4" and "-f flv" are also supported.
                                     You can also use the special names "best",
-                                     "bestaudio", "worst", and "worstaudio"
+                                     "bestaudio", "worst", and "worstaudio". By
+                                     default, youtube-dl will pick the best
+                                     quality.
    --all-formats                    download all available video formats
    --prefer-free-formats            prefer free video formats unless a specific
                                     one is requested
@@ -323,11 +325,27 @@ Since June 2012 (#342) youtube-dl is packed as an executable zipfile, simply unz

 To run the exe you need to install first the [Microsoft Visual C++ 2008 Redistributable Package](http://www.microsoft.com/en-us/download/details.aspx?id=29).

-# COPYRIGHT
+# BUILD INSTRUCTIONS

-youtube-dl is released into the public domain by the copyright holders.
+Most users do not need to build youtube-dl and can [download the builds](http://rg3.github.io/youtube-dl/download.html) or get them from their distribution.

-This README file was originally written by Daniel Bolton (<https://github.com/dbbolton>) and is likewise released into the public domain.
+To run youtube-dl as a developer, you don't need to build anything either. Simply execute
+
+    python -m youtube_dl
+
+To run the test, simply invoke your favorite test runner, or execute a test file directly; any of the following work:
+
+    python -m unittest discover
+    python test/test_download.py
+    nosetests
+
+If you want to create a build of youtube-dl yourself, you'll need
+
+* python
+* make
+* pandoc
+* zip
+* nosetests

 # BUGS

@@ -386,3 +404,9 @@ Only post features that you (or an incapicated friend you can personally talk to
 ###  Is your question about youtube-dl?

 It may sound strange, but some bug reports we receive are completely unrelated to youtube-dl and relate to a different or even the reporter's own application. Please make sure that you are actually using youtube-dl. If you are using a UI for youtube-dl, report the bug to the maintainer of the actual application providing the UI. On the other hand, if your UI for youtube-dl fails in some way you believe is related to youtube-dl, by all means, go ahead and report the bug.
+
+# COPYRIGHT
+
+youtube-dl is released into the public domain by the copyright holders.
+
+This README file was originally written by Daniel Bolton (<https://github.com/dbbolton>) and is likewise released into the public domain.
--- a/youtube_dl/YoutubeDL.py
+++ b/youtube_dl/YoutubeDL.py
@@ -396,10 +396,6 @@ class YoutubeDL(object):
        except UnicodeEncodeError:
            self.to_screen('[download] The file has already been downloaded')

-    def increment_downloads(self):
-        """Increment the ordinal that assigns a number to each file."""
-        self._num_downloads += 1
-
    def prepare_filename(self, info_dict):
        """Generate the output filename."""
        try:
@@ -517,6 +513,8 @@ class YoutubeDL(object):
            except ExtractorError as de: # An error we somewhat expected
                self.report_error(compat_str(de), de.format_traceback())
                break
+            except MaxDownloadsReached:
+                raise
            except Exception as e:
                if self.params.get('ignoreerrors', False):
                    self.report_error(compat_str(e), tb=compat_str(traceback.format_exc()))
@@ -771,8 +769,11 @@ class YoutubeDL(object):
        """Process a single resolved IE result."""

        assert info_dict.get('_type', 'video') == 'video'
-        #We increment the download the download count here to match the previous behaviour.
-        self.increment_downloads()
+
+        max_downloads = self.params.get('max_downloads')
+        if max_downloads is not None:
+            if self._num_downloads >= int(max_downloads):
+                raise MaxDownloadsReached()

        info_dict['fulltitle'] = info_dict['title']
        if len(info_dict['title']) > 200:
@@ -789,10 +790,7 @@ class YoutubeDL(object):
            self.to_screen('[download] ' + reason)
            return

-        max_downloads = self.params.get('max_downloads')
-        if max_downloads is not None:
-            if self._num_downloads > int(max_downloads):
-                raise MaxDownloadsReached()
+        self._num_downloads += 1

        filename = self.prepare_filename(info_dict)

@@ -1096,9 +1094,15 @@ class YoutubeDL(object):
                res += fdict['format_note'] + ' '
            if fdict.get('tbr') is not None:
                res += '%4dk ' % fdict['tbr']
+            if fdict.get('container') is not None:
+                if res:
+                    res += ', '
+                res += '%s container' % fdict['container']
            if (fdict.get('vcodec') is not None and
                    fdict.get('vcodec') != 'none'):
-                res += '%-5s' % fdict['vcodec']
+                if res:
+                    res += ', '
+                res += fdict['vcodec']
                if fdict.get('vbr') is not None:
                    res += '@'
            elif fdict.get('vbr') is not None and fdict.get('abr') is not None:
@@ -1108,7 +1112,10 @@ class YoutubeDL(object):
            if fdict.get('acodec') is not None:
                if res:
                    res += ', '
-                res += '%-5s' % fdict['acodec']
+                if fdict['acodec'] == 'none':
+                    res += 'video only'
+                else:
+                    res += '%-5s' % fdict['acodec']
            elif fdict.get('abr') is not None:
                if res:
                    res += ', '
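
Review note on the `--max-downloads` hunks above: the limit check now runs before any filtering, and `_num_downloads` is incremented only after a video has passed the match filter. The sketch below reproduces that ordering in isolation. `Counter`, `process` and `run` are hypothetical names made up for this illustration; only `MaxDownloadsReached`, `max_downloads` and `_num_downloads` come from the diff.

```python
class MaxDownloadsReached(Exception):
    """Signals that the configured download limit has been hit."""


class Counter(object):
    """Hypothetical stand-in for the counting logic shown in the diff."""

    def __init__(self, max_downloads=None):
        self.max_downloads = max_downloads
        self._num_downloads = 0

    def process(self, matches_filter):
        # Check the limit first, before any filtering takes place.
        if self.max_downloads is not None:
            if self._num_downloads >= int(self.max_downloads):
                raise MaxDownloadsReached()
        # A video rejected by a filter returns early and is never counted.
        if not matches_filter:
            return False
        self._num_downloads += 1
        return True


def run(titles, wanted, max_downloads):
    """Return the titles that would be downloaded, in order."""
    counter = Counter(max_downloads)
    downloaded = []
    try:
        for title in titles:
            if counter.process(title in wanted):
                downloaded.append(title)
    except MaxDownloadsReached:
        pass
    return downloaded


print(run(['a', 'b', 'c', 'd'], {'b', 'c', 'd'}, 2))
```

With a limit of 2 and the first title filtered out, the filtered title does not use up a slot, so `b` and `c` are both selected before the limit stops the loop.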
--- a/youtube_dl/__init__.py
+++ b/youtube_dl/__init__.py
@@ -261,7 +261,7 @@ def parseOpts(overrideArguments=None):

    video_format.add_option('-f', '--format',
            action='store', dest='format', metavar='FORMAT', default=None,
-            help='video format code, specify the order of preference using slashes: "-f 22/17/18". "-f mp4" and "-f flv" are also supported. You can also use the special names "best", "bestaudio", "worst", and "worstaudio"')
+            help='video format code, specify the order of preference using slashes: "-f 22/17/18". "-f mp4" and "-f flv" are also supported. You can also use the special names "best", "bestaudio", "worst", and "worstaudio". By default, youtube-dl will pick the best quality.')
    video_format.add_option('--all-formats',
            action='store_const', dest='format', help='download all available video formats', const='all')
    video_format.add_option('--prefer-free-formats',
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@@ -66,6 +66,7 @@ class InfoExtractor(object):
                    * asr        Audio sampling rate in Hertz
                    * vbr        Average video bitrate in KBit/s
                    * vcodec     Name of the video codec in use
+                    * container  Name of the container format
                    * filesize   The number of bytes, if known in advance
                    * player_url SWF Player URL (used for rtmpdump).
                    * protocol   The protocol that will be used for the actual
--- a/youtube_dl/extractor/rottentomatoes.py
+++ b/youtube_dl/extractor/rottentomatoes.py
@@ -1,3 +1,5 @@
+from __future__ import unicode_literals
+
 from .videodetective import VideoDetectiveIE


@@ -7,10 +9,10 @@ class RottenTomatoesIE(VideoDetectiveIE):
    _VALID_URL = r'https?://www\.rottentomatoes\.com/m/[^/]+/trailers/(?P<id>\d+)'

    _TEST = {
-        u'url': u'http://www.rottentomatoes.com/m/toy_story_3/trailers/11028566/',
-        u'file': '613340.mp4',
-        u'info_dict': {
-            u'title': u'TOY STORY 3',
-            u'description': u'From the creators of the beloved TOY STORY films, comes a story that will reunite the gang in a whole new way.',
+        'url': 'http://www.rottentomatoes.com/m/toy_story_3/trailers/11028566/',
+        'file': '613340.mp4',
+        'info_dict': {
+            'title': 'TOY STORY 3',
+            'description': 'From the creators of the beloved TOY STORY films, comes a story that will reunite the gang in a whole new way.',
        },
    }
--- a/youtube_dl/extractor/sina.py
+++ b/youtube_dl/extractor/sina.py
@@ -1,4 +1,5 @@
 # coding: utf-8
+from __future__ import unicode_literals

 import re

@@ -12,21 +13,31 @@ from ..utils import (
 class SinaIE(InfoExtractor):
    _VALID_URL = r'''https?://(.*?\.)?video\.sina\.com\.cn/
                        (
-                            (.+?/(((?P<pseudo_id>\d+).html)|(.*?(\#|(vid=))(?P<id>\d+?)($|&))))
+                            (.+?/(((?P<pseudo_id>\d+).html)|(.*?(\#|(vid=)|b/)(?P<id>\d+?)($|&|\-))))
                            |
                            # This is used by external sites like Weibo
                            (api/sinawebApi/outplay.php/(?P<token>.+?)\.swf)
                        )
                  '''

-    _TEST = {
-        u'url': u'http://video.sina.com.cn/news/vlist/zt/chczlj2013/?opsubject_id=top12#110028898',
-        u'file': u'110028898.flv',
-        u'md5': u'd65dd22ddcf44e38ce2bf58a10c3e71f',
-        u'info_dict': {
-            u'title': u'《中国新闻》 朝鲜要求巴拿马立即释放被扣船员',
-        }
-    }
+    _TESTS = [
+        {
+            'url': 'http://video.sina.com.cn/news/vlist/zt/chczlj2013/?opsubject_id=top12#110028898',
+            'file': '110028898.flv',
+            'md5': 'd65dd22ddcf44e38ce2bf58a10c3e71f',
+            'info_dict': {
+                'title': '《中国新闻》 朝鲜要求巴拿马立即释放被扣船员',
+            }
+        },
+        {
+            'url': 'http://video.sina.com.cn/v/b/101314253-1290078633.html',
+            'info_dict': {
+                'id': '101314253',
+                'ext': 'flv',
+                'title': '军方提高对朝情报监视级别',
+            },
+        },
+    ]

    @classmethod
    def suitable(cls, url):
@@ -35,10 +46,10 @@ class SinaIE(InfoExtractor):
    def _extract_video(self, video_id):
        data = compat_urllib_parse.urlencode({'vid': video_id})
        url_doc = self._download_xml('http://v.iask.com/v_play.php?%s' % data,
-            video_id, u'Downloading video url')
+            video_id, 'Downloading video url')
        image_page = self._download_webpage(
            'http://interface.video.sina.com.cn/interface/common/getVideoImage.php?%s' % data,
-            video_id, u'Downloading thumbnail info')
+            video_id, 'Downloading thumbnail info')

        return {'id': video_id,
                'url': url_doc.find('./durl/url').text,
@@ -52,7 +63,7 @@ class SinaIE(InfoExtractor):
        video_id = mobj.group('id')
        if mobj.group('token') is not None:
            # The video id is in the redirected url
-            self.to_screen(u'Getting video id')
+            self.to_screen('Getting video id')
            request = compat_urllib_request.Request(url)
            request.get_method = lambda: 'HEAD'
            (_, urlh) = self._download_webpage_handle(request, 'NA', False)
@@ -60,6 +71,6 @@ class SinaIE(InfoExtractor):
        elif video_id is None:
            pseudo_id = mobj.group('pseudo_id')
            webpage = self._download_webpage(url, pseudo_id)
-            video_id = self._search_regex(r'vid:\'(\d+?)\'', webpage, u'video id')
+            video_id = self._search_regex(r'vid:\'(\d+?)\'', webpage, 'video id')

        return self._extract_video(video_id)
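
Review note on the widened `_VALID_URL` above: the pattern can be checked on its own against the two test URLs from `_TESTS`. This is a minimal sketch, assuming the pattern is applied with `re.VERBOSE` (the embedded `#` comment suggests as much, but the diff does not show the call site). `SINA_VALID_URL` and `sina_video_id` are hypothetical names used only here.

```python
import re

# Pattern copied from the new side of the hunk above.
SINA_VALID_URL = r'''https?://(.*?\.)?video\.sina\.com\.cn/
                    (
                        (.+?/(((?P<pseudo_id>\d+).html)|(.*?(\#|(vid=)|b/)(?P<id>\d+?)($|&|\-))))
                        |
                        # This is used by external sites like Weibo
                        (api/sinawebApi/outplay.php/(?P<token>.+?)\.swf)
                    )
              '''


def sina_video_id(url):
    """Return the 'id' group for url, or None if nothing matched."""
    mobj = re.match(SINA_VALID_URL, url, flags=re.VERBOSE)
    return mobj.group('id') if mobj else None


# The newly supported /v/b/{id}-*.html form and the older '#id' form.
print(sina_video_id('http://video.sina.com.cn/v/b/101314253-1290078633.html'))
print(sina_video_id('http://video.sina.com.cn/news/vlist/zt/chczlj2013/?opsubject_id=top12#110028898'))
```

The added `b/` alternative and the `\-` terminator are what let the id stop at the hyphen in the new URL form.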
--- a/youtube_dl/extractor/xhamster.py
+++ b/youtube_dl/extractor/xhamster.py
@@ -1,10 +1,11 @@
+from __future__ import unicode_literals
+
 import re

 from .common import InfoExtractor
 from ..utils import (
    compat_urllib_parse,
    unescapeHTML,
-    determine_ext,
    ExtractorError,
 )

@@ -13,25 +14,25 @@ class XHamsterIE(InfoExtractor):
    """Information Extractor for xHamster"""
    _VALID_URL = r'(?:http://)?(?:www\.)?xhamster\.com/movies/(?P<id>[0-9]+)/(?P<seo>.+?)\.html(?:\?.*)?'
    _TESTS = [{
-        u'url': u'http://xhamster.com/movies/1509445/femaleagent_shy_beauty_takes_the_bait.html',
-        u'file': u'1509445.flv',
-        u'md5': u'9f48e0e8d58e3076bb236ff412ab62fa',
-        u'info_dict': {
-            u"upload_date": u"20121014", 
-            u"uploader_id": u"Ruseful2011", 
-            u"title": u"FemaleAgent Shy beauty takes the bait",
-            u"age_limit": 18,
+        'url': 'http://xhamster.com/movies/1509445/femaleagent_shy_beauty_takes_the_bait.html',
+        'file': '1509445.mp4',
+        'md5': '8281348b8d3c53d39fffb377d24eac4e',
+        'info_dict': {
+            "upload_date": "20121014",
+            "uploader_id": "Ruseful2011",
+            "title": "FemaleAgent Shy beauty takes the bait",
+            "age_limit": 18,
        }
    },
    {
-        u'url': u'http://xhamster.com/movies/2221348/britney_spears_sexy_booty.html?hd',
-        u'file': u'2221348.flv',
-        u'md5': u'e767b9475de189320f691f49c679c4c7',
-        u'info_dict': {
-            u"upload_date": u"20130914",
-            u"uploader_id": u"jojo747400",
-            u"title": u"Britney Spears  Sexy Booty",
-            u"age_limit": 18,
+        'url': 'http://xhamster.com/movies/2221348/britney_spears_sexy_booty.html?hd',
+        'file': '2221348.flv',
+        'md5': 'e767b9475de189320f691f49c679c4c7',
+        'info_dict': {
+            "upload_date": "20130914",
+            "uploader_id": "jojo747400",
+            "title": "Britney Spears  Sexy Booty",
+            "age_limit": 18,
        }
    }]

@@ -39,14 +40,21 @@ class XHamsterIE(InfoExtractor):
        def extract_video_url(webpage):
            mobj = re.search(r'\'srv\': \'(?P<server>[^\']*)\',\s*\'file\': \'(?P<file>[^\']+)\',', webpage)
            if mobj is None:
-                raise ExtractorError(u'Unable to extract media URL')
+                raise ExtractorError('Unable to extract media URL')
            if len(mobj.group('server')) == 0:
                return compat_urllib_parse.unquote(mobj.group('file'))
            else:
                return mobj.group('server')+'/key='+mobj.group('file')

+        def extract_mp4_video_url(webpage):
+            mp4 = re.search(r'<a href=\"(.+?)\" class=\"mp4Play\"',webpage)
+            if mp4 is None:
+                return None
+            else:
+                return mp4.group(1)
+
        def is_hd(webpage):
-            return webpage.find('<div class=\'icon iconHD\'') != -1
+            return '<div class=\'icon iconHD\'' in webpage

        mobj = re.match(self._VALID_URL, url)

@@ -55,50 +63,60 @@ class XHamsterIE(InfoExtractor):
        mrss_url = 'http://xhamster.com/movies/%s/%s.html' % (video_id, seo)
        webpage = self._download_webpage(mrss_url, video_id)

-        video_title = self._html_search_regex(r'<title>(?P<title>.+?) - xHamster\.com</title>',
-            webpage, u'title')
+        video_title = self._html_search_regex(
+            r'<title>(?P<title>.+?) - xHamster\.com</title>', webpage, 'title')

        # Only a few videos have an description
-        mobj = re.search('<span>Description: </span>(?P<description>[^<]+)', webpage)
-        if mobj:
-            video_description = unescapeHTML(mobj.group('description'))
-        else:
-            video_description = None
+        mobj = re.search(r'<span>Description: </span>([^<]+)', webpage)
+        video_description = mobj.group(1) if mobj else None

        mobj = re.search(r'hint=\'(?P<upload_date_Y>[0-9]{4})-(?P<upload_date_m>[0-9]{2})-(?P<upload_date_d>[0-9]{2}) [0-9]{2}:[0-9]{2}:[0-9]{2} [A-Z]{3,4}\'', webpage)
        if mobj:
            video_upload_date = mobj.group('upload_date_Y')+mobj.group('upload_date_m')+mobj.group('upload_date_d')
        else:
            video_upload_date = None
-            self._downloader.report_warning(u'Unable to extract upload date')
+            self._downloader.report_warning('Unable to extract upload date')

-        video_uploader_id = self._html_search_regex(r'<a href=\'/user/[^>]+>(?P<uploader_id>[^<]+)',
-            webpage, u'uploader id', default=u'anonymous')
+        video_uploader_id = self._html_search_regex(
+            r'<a href=\'/user/[^>]+>(?P<uploader_id>[^<]+)',
+            webpage, 'uploader id', default='anonymous')

-        video_thumbnail = self._search_regex(r'\'image\':\'(?P<thumbnail>[^\']+)\'',
-            webpage, u'thumbnail', fatal=False)
+        video_thumbnail = self._search_regex(
+            r'\'image\':\'(?P<thumbnail>[^\']+)\'',
+            webpage, 'thumbnail', fatal=False)

        age_limit = self._rta_search(webpage)

-        video_url = extract_video_url(webpage)
        hd = is_hd(webpage)
+        video_url = extract_video_url(webpage)
        formats = [{
            'url': video_url,
-            'ext': determine_ext(video_url),
-            'format': 'hd' if hd else 'sd',
            'format_id': 'hd' if hd else 'sd',
+            'preference': 0,
        }]
+
+        video_mp4_url = extract_mp4_video_url(webpage)
+        if video_mp4_url is not None:
+            formats.append({
+                'url': video_mp4_url,
+                'ext': 'mp4',
+                'format_id': 'mp4-hd' if hd else 'mp4-sd',
+                'preference': 1,
+            })
+
        if not hd:
-            webpage = self._download_webpage(mrss_url+'?hd', video_id)
+            webpage = self._download_webpage(
+                mrss_url + '?hd', video_id, note='Downloading HD webpage')
            if is_hd(webpage):
                video_url = extract_video_url(webpage)
                formats.append({
                    'url': video_url,
-                    'ext': determine_ext(video_url),
-                    'format': 'hd',
                    'format_id': 'hd',
+                    'preference': 2,
                })

+        self._sort_formats(formats)
+
        return {
            'id': video_id,
            'title': video_title,
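
Review note on the `preference` fields above: the hunk replaces the per-format `ext`/`format` guesses with explicit `preference` values and then calls `self._sort_formats(formats)`. The body of `_sort_formats` is not part of this diff, so the sketch below only illustrates the assumed effect: ordering from least to most preferred so that the best candidate ends up last. `sort_by_preference` is a hypothetical helper, not the real method.

```python
def sort_by_preference(formats):
    """Return formats ordered from least to most preferred.

    Assumption: a missing or None preference sorts below every explicit one.
    """
    def key(fmt):
        pref = fmt.get('preference')
        return -1 if pref is None else pref
    return sorted(formats, key=key)


# Format ids and preference values as they appear in the hunk above.
formats = [
    {'format_id': 'hd', 'preference': 2},
    {'format_id': 'sd', 'preference': 0},
    {'format_id': 'mp4-sd', 'ext': 'mp4', 'preference': 1},
]
ordered = sort_by_preference(formats)
print([f['format_id'] for f in ordered])
```

Under that assumption, the HD variant fetched from the `?hd` page wins over the MP4 link, which in turn wins over the default stream.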
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@@ -207,6 +207,11 @@ class YoutubeIE(YoutubeBaseInfoExtractor, SubtitlesInfoExtractor):
        '141': {'ext': 'm4a', 'format_note': 'DASH audio', 'vcodec': 'none', 'abr': 256, 'preference': -50},

        # Dash webm
+        '167': {'ext': 'webm', 'height': 360, 'width': 640, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'VP8', 'acodec': 'none', 'preference': -40},
+        '168': {'ext': 'webm', 'height': 480, 'width': 854, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'VP8', 'acodec': 'none', 'preference': -40},
+        '168': {'ext': 'webm', 'height': 1080, 'width': 1920, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'VP8', 'acodec': 'none', 'preference': -40},
+        '218': {'ext': 'webm', 'height': 480, 'width': 854, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'VP8', 'acodec': 'none', 'preference': -40},
+        '219': {'ext': 'webm', 'height': 480, 'width': 854, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'VP8', 'acodec': 'none', 'preference': -40},
        '242': {'ext': 'webm', 'height': 240, 'resolution': '240p', 'format_note': 'DASH webm', 'preference': -40},
        '243': {'ext': 'webm', 'height': 360, 'resolution': '360p', 'format_note': 'DASH webm', 'preference': -40},
        '244': {'ext': 'webm', 'height': 480, 'resolution': '480p', 'format_note': 'DASH webm', 'preference': -40},
@@ -1290,7 +1295,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor, SubtitlesInfoExtractor):
                    'url': video_real_url,
                    'player_url': player_url,
                }
-                dct.update(self._formats[itag])
+                if itag in self._formats:
+                    dct.update(self._formats[itag])
                formats.append(dct)
            return formats

@@ -1805,7 +1811,10 @@ class YoutubeFavouritesIE(YoutubeBaseInfoExtractor):
 class YoutubeTruncatedURLIE(InfoExtractor):
    IE_NAME = 'youtube:truncated_url'
    IE_DESC = False  # Do not list
-    _VALID_URL = r'(?:https?://)?[^/]+/watch\?feature=[a-z_]+$'
+    _VALID_URL = r'''(?x)
+        (?:https?://)?[^/]+/watch\?feature=[a-z_]+$|
+        (?:https?://)?(?:www\.)?youtube\.com/attribution_link\?a=[^&]+$
+    '''

    def _real_extract(self, url):
        raise ExtractorError(
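
Review note on the new `youtube:truncated_url` pattern above: the second alternative only matches an `attribution_link` URL that ends right after the `a=` parameter, which is presumably what remains when an unquoted `&` cuts the URL short in a shell. A minimal sketch for checking that reading; `TRUNCATED_URL` and `looks_truncated` are hypothetical names, and the sample URLs are made up for illustration.

```python
import re

# Pattern copied from the new side of the hunk above.
TRUNCATED_URL = r'''(?x)
    (?:https?://)?[^/]+/watch\?feature=[a-z_]+$|
    (?:https?://)?(?:www\.)?youtube\.com/attribution_link\?a=[^&]+$
'''


def looks_truncated(url):
    """Return True if url matches the truncated-URL pattern."""
    return re.match(TRUNCATED_URL, url) is not None


# Cut off after the first parameter: matches.
print(looks_truncated('https://www.youtube.com/attribution_link?a=fF1CWYwxCQ4'))
# Complete link with a second parameter: does not match.
print(looks_truncated('https://www.youtube.com/attribution_link?a=fF1CWYwxCQ4&u=%2Fwatch%3Fv%3DBaW_jenozKc'))
```

Because `[^&]+$` requires the string to end before any `&`, complete attribution links fall through to other extractors.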
--- a/youtube_dl/update.py
+++ b/youtube_dl/update.py
@@ -90,7 +90,7 @@ def update_self(to_screen, verbose):
        to_screen(u'youtube-dl is up to date (%s)' % __version__)
        return

-    to_screen(u'Updating to version ' + version_id + '...')
+    to_screen(u'Updating to version ' + version_id + ' ...')
    version = versions_info['versions'][version_id]

    print_notes(to_screen, versions_info['versions'])
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@@ -1,2 +1,2 @@

-__version__ = '2014.01.23'
+__version__ = '2014.01.23.3'
Author	SHA1	Message	Date
Philipp Hagemeister	f265fc1238	release 2014.01.23.3	2014-01-23 23:55:53 +01:00
Philipp Hagemeister	1394ce65b4	[youtube] Add new formats (Fixes #2221 )	2014-01-23 23:54:06 +01:00
Philipp Hagemeister	63ef36e8d8	Add build instructions (Fixes #2218 )	2014-01-23 23:28:29 +01:00
Philipp Hagemeister	0b65e5d40f	[youtube] Do not break upon unknown formats	2014-01-23 23:21:42 +01:00
Philipp Hagemeister	629be17af4	release 2014.01.23.2	2014-01-23 19:05:05 +01:00
Philipp Hagemeister	fd28827864	Do not count unmatched videos for --max-downloads (Fixes #2211 )	2014-01-23 19:04:22 +01:00
Philipp Hagemeister	8c61d9a9b1	Mention default for -f (Fixes #2215 )	2014-01-23 18:50:04 +01:00
Philipp Hagemeister	975d35dbab	[youtube:truncated_url] Also match mail subscription links (#2214 )	2014-01-23 16:14:54 +01:00
Jaime Marquínez Ferrándiz	8b769664c4	[sina] Recognize http://video.sina.com.cn/v/b/{id}-*.html urls (fixes #2212 )	2014-01-23 14:03:14 +01:00
Jaime Marquínez Ferrándiz	76f270a46a	[sina] use unicode_literals	2014-01-23 14:00:29 +01:00
Philipp Hagemeister	9dab1b7f28	release 2014.01.23.1	2014-01-23 10:37:34 +01:00
Philipp Hagemeister	d3e5bbf437	Correct --max-downloads with --ignore-errors	2014-01-23 10:36:47 +01:00
Philipp Hagemeister	18a25c5d78	Clarify update output (Fixes #2205 ) No, we are not intentionally hiding the version number. Why would we?	2014-01-23 10:24:44 +01:00
Philipp Hagemeister	924f47f7b6	[rottentomatoes] Use unicode_literals	2014-01-23 04:05:58 +01:00
Philipp Hagemeister	22ff1c4a93	[xhamster] Futher simplification	2014-01-23 04:04:39 +01:00
Philipp Hagemeister	35409e1101	[xhamster] Use unicode_literals	2014-01-23 03:52:59 +01:00
Mike Col	65d781128a	[xhamster] Add support for hd video Signed-off-by: Philipp Hagemeister <phihag@phihag.de>	2014-01-23 03:51:09 +01:00