Compare commits

...

67 Commits

Author SHA1 Message Date
pukkandan
ba9f36d732 Release 2021.02.09 2021-02-10 01:26:56 +05:30
pukkandan
cffab0eefc [embedsubtitle] Keep original subtitle after conversion if write_subtitles given
Closes: https://github.com/pukkandan/yt-dlp/issues/57#issuecomment-775227745

:ci skip dl
2021-02-10 00:12:42 +05:30
pukkandan
2e339f59c3 [embedthumbnail] Keep original thumbnail after conversion if write_thumbnail given (Closes #67)
Closes https://github.com/ytdl-org/youtube-dl/issues/27041

:ci skip dl
2021-02-09 23:18:20 +05:30
pukkandan
6c4fd172de Add fallback for thumbnails
Workaround for: https://github.com/ytdl-org/youtube-dl/issues/28023
Related: https://github.com/ytdl-org/youtube-dl/pull/28031

Also fixes https://www.reddit.com/r/youtubedl/comments/lfslw1/youtubedlp_with_aria2c_for_dash_support_is/gmolt0r?context=3
2021-02-09 23:12:41 +05:30
pukkandan
deaec5afc2 [youtube] Fix tests 2021-02-09 22:01:34 +05:30
pukkandan
69184e4152 [youtube] Simplified renderer parsing 2021-02-09 21:37:59 +05:30
pukkandan
a1b535bd75 [youtube] Support gridPlaylistRenderer and gridVideoRenderer (Closes #65) 2021-02-09 20:40:37 +05:30
pukkandan
b3943b2f33 [pyinst.py] Move back to root dir (Closes #63) 2021-02-09 18:04:27 +05:30
shirt-dev
3dd264bf42 #64 Implement self updater
Co-authored-by: shirtjs <2660574+shirtjs@users.noreply.github.com> (shirt-dev)
Co-authored-by: pukkandan <pukkandan@gmail.com>
2021-02-09 18:04:00 +05:30
pukkandan
efabc16165 [postprocessor] Fix bug (Closes #62)
introduced by: 1bf540d28b

:ci skip dl
2021-02-09 00:27:39 +05:30
shirt-dev
5219cb3e75 #55 Add aria2c support for DASH (mpd) and HLS (m3u8)
Co-authored-by: Dan <2660574+shirtjs@users.noreply.github.com>
Co-authored-by: pukkandan <pukkandan@gmail.com>
2021-02-08 22:16:01 +05:30
pukkandan
ff84930c86 [youtube] Bugfix (Closes #60) 2021-02-08 19:20:19 +05:30
pukkandan
06ff212d64 [documentation] Crypto is an optional dependency 2021-02-08 18:05:22 +05:30
pukkandan
1bf540d28b [sponskrub] Don't raise error when the video does not exist
Eg: `--convert-sub srt --no-download --sponskrub` gave error before

:ci skip dl
2021-02-08 15:48:12 +05:30
pukkandan
df692c5a7a [remuxvideo] Fix validation of conditional remux 2021-02-08 15:29:02 +05:30
pukkandan
ecc97af344 [youtube] Don't show warning for empty playlist description (Closes #54)
:ci skip dl
2021-02-07 20:15:02 +05:30
pukkandan
8a0b932258 [movefiles] Fix compatibility with python2
:ci skip dl
2021-02-07 17:41:41 +05:30
pukkandan
4d608b522f [youtube_live_chat] Improve extraction
:ci skip dl
2021-02-07 15:22:36 +05:30
pukkandan
885d36d4e4 [youtube] Fix comment extraction (Closes #53)
:ci skip dl
2021-02-05 16:47:44 +05:30
pukkandan
0fd1a2b0bf [version] update (and linter) 2021-02-05 05:02:41 +05:30
pukkandan
c25228e5da Release 2021.02.04 2021-02-05 04:50:38 +05:30
pukkandan
de6000d913 Multiple output templates for different file types
Syntax: -o common_template -o type:type_template
Types supported: subtitle|thumbnail|description|annotation|infojson|pl_description|pl_infojson
2021-02-05 04:11:39 +05:30
pukkandan
ff88a05cff [pyinst] Automatically detect python architecture and working directory
:ci skip all
2021-02-04 22:09:10 +05:30
pukkandan
8a784c74d1 [linter] youtube.py 2021-02-04 20:29:25 +05:30
pukkandan
545cc85d11 [youtube] Update to ytdl-2021.02.04.1 2021-02-04 20:07:17 +05:30
pukkandan
c10d0213fc [FormatSort] fix bug where quality had more priority than hasvid 2021-02-04 19:42:14 +05:30
pukkandan
2181983a0c Update to ytdl-2021.02.04.1 except youtube 2021-02-04 13:26:22 +05:30
pukkandan
e29663c644 #45 Allow date/time formatting in output template
Closes #43
2021-02-03 02:45:00 +05:30
pukkandan
9c3fe2ef80 [youtube_live_chat] Fix URL
Bug introduced by 82e3f6ebda

:ci skip dl
2021-02-03 02:22:27 +05:30
pukkandan
b60419c51a [youtube] More metadata extraction for channels/playlists 2021-02-02 21:51:32 +05:30
pukkandan
18590cecdb Strip out internal fields such as _filename from infojson (Closes #42)
:ci skip dl
2021-02-02 03:19:21 +05:30
pukkandan
9f888147de [FormatSort] Allow user to prefer av01 over vp9
The default is still vp9
2021-02-02 03:19:21 +05:30
pukkandan
e8be92f9d6 Fix "Default format spec" appearing in quiet mode 2021-02-02 03:19:21 +05:30
pukkandan
b9d973bef1 Fix issue with overwriting files 2021-02-02 03:19:21 +05:30
pukkandan
c55256c5a3 [audius] Fix extractor 2021-02-01 15:03:59 +05:30
pukkandan
82e3f6ebda [youtube_live_chat] Fix parse_yt_initial_data and add fragment_retries
:ci skip dl
2021-01-31 20:52:43 +05:30
pukkandan
af819c216f [postprocessor] Raise errors correctly
Previously, when a postprocessor reported error, the download was still considered a success. This causes issues especially with critical PPs like Merger, MoveFiles etc

:ci skip dl
2021-01-30 18:07:21 +05:30
pukkandan
e3b771a898 fix typos :ci skip dl 2021-01-30 16:49:58 +05:30
pukkandan
cac96421d9 New option --no-write-playlist-metafiles to NOT write playlist metadata files 2021-01-30 16:43:20 +05:30
pukkandan
7c245ce877 [metadatafromtitle] Fix bug when extracting data from numeric fields
:ci skip dl
2021-01-30 14:36:10 +05:30
pukkandan
eabce90175 [version] update
:ci skip dl
2021-01-29 23:42:28 +05:30
pukkandan
29b6000e35 Release 2021.01.29 2021-01-29 23:25:18 +05:30
pukkandan
e38df8f9fa Refactor update-version, pyinst.py and related files
* Refactor update-version
* Moved pyinst, update-version and icon into devscripts
* pyinst doesn't bump version anymore
* Merge pyinst and pyinst32. Usage: `pyinst.py [32|64]`
* Add mutagen as requirement
* Remove make_win and related files
2021-01-29 23:16:00 +05:30
pukkandan
caa15a7b57 [Audius] Add extractor (Closes #40)
Related: https://github.com/ytdl-org/youtube-dl/pull/27360
Related: https://github.com/ytdl-org/youtube-dl/issues/24216

Direct API URLs are not currently supported. See https://github.com/ytdl-org/youtube-dl/pull/27360#issuecomment-757123708 for details

Co-authored by: qulas
2021-01-29 22:30:22 +05:30
pukkandan
105b0b700e Populate "playlist_*" fields for setting playlist metadata filename
Related: #36
2021-01-29 01:57:14 +05:30
pukkandan
66c935fb16 Linter and misc cleanup
:ci skip dl
2021-01-29 01:03:32 +05:30
pukkandan
64c0d954e5 [youtube] Extract playlist description 2021-01-29 00:31:50 +05:30
pukkandan
bf330f5f29 [anvato] Workaround for anvato_token_generator import failing (Closes #35)
:ci skip dl
2021-01-28 15:57:37 +05:30
pukkandan
f6d7624f57 Partial solution for detecting existing files correctly even when extracting audio
* Does not work when audio format is 'best'
2021-01-28 15:50:03 +05:30
pukkandan
ece8a2a1b6 [embedthumbnail] Fix for missing output filename for ffmpeg call (Closes #38) 2021-01-28 15:48:33 +05:30
Bepis
8d0ea5f955 [Youtube] Improve comment API requests
co-authored by bbepis
2021-01-28 11:49:31 +05:30
pukkandan
0748b3317b Seperate import of lazy_extractors from that of normal extractors
This prevents "ModuleNotFoundError: No module named 'youtube_dl.extractor.lazy_extractors'" from appearing in the traceback

Related: https://github.com/animelover1984/youtube-dl/issues/17#issuecomment-757945024
2021-01-28 11:25:42 +05:30
pukkandan
6b591b2925 Detect existing files correctly even when there is remux/recode
:ci skip dl
2021-01-28 10:49:37 +05:30
pukkandan
179122495b [ffmpeg] Document more formats that are supported for remux/recode 2021-01-28 10:36:34 +05:30
pukkandan
02fd60d305 Write playlist description to file (Closes #36)
:ci skip dl
2021-01-28 06:25:18 +05:30
pukkandan
06167fbbd3 #31 Features from animelover1984/youtube-dl
* Add `--get-comments`
* [youtube] Extract comments
* [billibilli] Added BiliBiliSearchIE, BilibiliChannelIE
* [billibilli] Extract comments
* [billibilli] Better video extraction
* Write playlist data to infojson
* [FFmpegMetadata] Embed infojson inside the video
* [EmbedThumbnail] Try embedding in mp4 using ffprobe and `-disposition`
* [EmbedThumbnail] Treat mka like mkv and mov like mp4
* [EmbedThumbnail] Embed in ogg/opus
* [VideoRemuxer] Conditionally remux video
* [VideoRemuxer] Add `-movflags +faststart` when remuxing from mp4
* [ffmpeg] Print entire stderr in verbose when there is error
* [EmbedSubtitle] Warn when embedding ass in mp4
* [avanto] Use NFLTokenGenerator if possible
2021-01-27 20:32:51 +05:30
pukkandan
4ff5e98991 More badges
:ci skip all
2021-01-27 20:16:34 +05:30
pukkandan
e4172ac903 Deprecate avconv/avprobe
All current functionality is left untouched. But don't expect any new features to work with avconv

:ci skip all
2021-01-26 23:27:32 +05:30
pukkandan
5bfa486205 Add option --parse-metadata
* The fields extracted by this can be used in `--output`
* Deprecated `--metadata-from-title`

:ci skip dl
2021-01-26 16:14:31 +05:30
pukkandan
9882064024 [movefiles] Don't give "cant find" warning when move is unnecessary 2021-01-26 15:53:32 +05:30
pukkandan
2d6921210d [postprocessor] fix write_debug when no _downloader 2021-01-26 15:53:22 +05:30
pukkandan
f137c99e9f Fix some fields not sorting correctly
bug introduced by: 63be1aab2f
2021-01-25 19:28:39 +05:30
pukkandan
6b8eb0c024 Report error message from youtube as error (Closes #33)
:ci skip dl
2021-01-25 10:26:51 +05:30
pukkandan
5b328c97d7 Changed revision number to use '.' instead of '-'
and refactor it

:ci skip dl
2021-01-25 02:25:05 +05:30
pukkandan
b5d265633d Fix wrong user config (Closes #32)
:ci skip dl
2021-01-25 01:52:47 +05:30
pukkandan
a392adf56c [version] update
:ci skip dl
2021-01-24 21:51:50 +05:30
pukkandan
0bc0a32290 Release 2021.01.24 2021-01-24 21:39:55 +05:30
81 changed files with 3416 additions and 3246 deletions

View File

@@ -21,7 +21,7 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dlc:
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.01.20. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.02.04. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in https://github.com/pukkandan/yt-dlp.
- Search the bugtracker for similar issues: https://github.com/pukkandan/yt-dlp. DO NOT post duplicates.
@@ -29,7 +29,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running yt-dlp version **2021.01.20**
- [ ] I've verified that I'm running yt-dlp version **2021.02.04**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
@@ -44,7 +44,7 @@ Add the `-v` flag to your command line you run youtube-dlc with (`youtube-dlc -v
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] yt-dlp version 2021.01.20
[debug] yt-dlp version 2021.02.04
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@@ -21,7 +21,7 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dlc:
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.01.20. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.02.04. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://github.com/pukkandan/yt-dlp. yt-dlp does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: https://github.com/pukkandan/yt-dlp. DO NOT post duplicates.
@@ -29,7 +29,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running yt-dlp version **2021.01.20**
- [ ] I've verified that I'm running yt-dlp version **2021.02.04**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones

View File

@@ -21,13 +21,13 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dlc:
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.01.20. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.02.04. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: https://github.com/pukkandan/yt-dlp. DO NOT post duplicates.
- Finally, put x into all relevant boxes like this [x] (Dont forget to delete the empty space)
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running yt-dlp version **2021.01.20**
- [ ] I've verified that I'm running yt-dlp version **2021.02.04**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones

View File

@@ -21,7 +21,7 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dlc:
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.01.20. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.02.04. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in https://github.com/pukkandan/yt-dlp.
- Search the bugtracker for similar issues: https://github.com/pukkandan/yt-dlp. DO NOT post duplicates.
@@ -30,7 +30,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running yt-dlp version **2021.01.20**
- [ ] I've verified that I'm running yt-dlp version **2021.02.04**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
@@ -46,7 +46,7 @@ Add the `-v` flag to your command line you run youtube-dlc with (`youtube-dlc -v
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] yt-dlp version 2021.01.20
[debug] yt-dlp version 2021.02.04
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@@ -21,13 +21,13 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dlc:
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.01.20. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.02.04. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: https://github.com/pukkandan/yt-dlp. DO NOT post duplicates.
- Finally, put x into all relevant boxes like this [x] (Dont forget to delete the empty space)
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running yt-dlp version **2021.01.20**
- [ ] I've verified that I'm running yt-dlp version **2021.02.04**
- [ ] I've searched the bugtracker for similar feature requests including closed ones

View File

@@ -25,8 +25,8 @@ jobs:
run: sudo apt-get -y install zip pandoc man
- name: Bump version
id: bump_version
run: python scripts/update-version-workflow.py
- name: Check the output from My action
run: python devscripts/update-version.py
- name: Print version
run: echo "${{ steps.bump_version.outputs.ytdlc_version }}"
- name: Run Make
run: make
@@ -84,11 +84,14 @@ jobs:
with:
python-version: '3.8'
- name: Install Requirements
run: pip install pyinstaller
run: pip install pyinstaller mutagen Crypto
- name: Bump version
run: python scripts/update-version-workflow.py
id: bump_version
run: python devscripts/update-version.py
- name: Print version
run: echo "${{ steps.bump_version.outputs.ytdlc_version }}"
- name: Run PyInstaller Script
run: python pyinst.py
run: python pyinst.py 64
- name: Upload youtube-dlc.exe Windows binary
id: upload-release-windows
uses: actions/upload-release-asset@v1
@@ -119,11 +122,14 @@ jobs:
python-version: '3.4.4'
architecture: 'x86'
- name: Install Requirements for 32 Bit
run: pip install pyinstaller==3.5
run: pip install pyinstaller==3.5 mutagen Crypto
- name: Bump version
run: python scripts/update-version-workflow.py
id: bump_version
run: python devscripts/update-version.py
- name: Print version
run: echo "${{ steps.bump_version.outputs.ytdlc_version }}"
- name: Run PyInstaller Script for 32 Bit
run: python pyinst32.py
run: python pyinst.py 32
- name: Upload Executable youtube-dlc_x86.exe
id: upload-release-windows32
uses: actions/upload-release-asset@v1
@@ -162,18 +168,15 @@ jobs:
asset_name: SHA2-256SUMS
asset_content_type: text/plain
update_version_badge:
runs-on: ubuntu-latest
needs: build_unix
steps:
- name: Create Version Badge
uses: schneegans/dynamic-badges-action@v1.0.0
with:
auth: ${{ secrets.GIST_TOKEN }}
gistID: c69cb23c3c5b3316248e52022790aa57
filename: version.json
label: Version
message: ${{ needs.build_unix.outputs.ytdlc_version }}
# update_version_badge:
# runs-on: ubuntu-latest
# needs: build_unix
# steps:
# - name: Create Version Badge
# uses: schneegans/dynamic-badges-action@v1.0.0
# with:
# auth: ${{ secrets.GIST_TOKEN }}
# gistID: c69cb23c3c5b3316248e52022790aa57
# filename: version.json
# label: Version
# message: ${{ needs.build_unix.outputs.ytdlc_version }}

View File

@@ -2,7 +2,7 @@ name: Quick Test
on: [push, pull_request]
jobs:
tests:
name: Core Tests
name: Core Test
if: "!contains(github.event.head_commit.message, 'ci skip all')"
runs-on: ubuntu-latest
steps:

95
.gitignore vendored
View File

@@ -1,35 +1,44 @@
# Python
*.pyc
*.pyo
*.class
*~
*.DS_Store
wine-py2exe/
py2exe.log
*.kate-swp
build/
dist/
zip/
tmp/
venv/
# Misc
*~
*.DS_Store
*.kate-swp
MANIFEST
README.txt
youtube-dl.1
youtube-dlc.1
youtube-dl.bash-completion
youtube-dlc.bash-completion
youtube-dl.fish
youtube-dlc.fish
youtube_dl/extractor/lazy_extractors.py
youtube_dlc/extractor/lazy_extractors.py
youtube-dl
youtube-dlc
youtube-dl.exe
youtube-dlc.exe
youtube-dl.tar.gz
youtube-dlc.tar.gz
youtube-dlc.spec
test/local_parameters.json
.coverage
cover/
secrets/
updates_key.pem
*.egg-info
.tox
*.class
# Generated
README.txt
*.1
*.bash-completion
*.fish
*.exe
*.tar.gz
*.zsh
*.spec
# Binary
youtube-dl
youtube-dlc
*.exe
# Downloaded
*.srt
*.ttml
*.sbv
@@ -46,32 +55,36 @@ updates_key.pem
*.swf
*.part
*.ytdl
*.conf
*.swp
*.ogg
*.opus
*.info.json
*.live_chat.json
*.jpg
*.png
*.webp
*.annotations.xml
*.description
# Config
*.conf
*.spec
*.exe
test/local_parameters.json
.tox
youtube-dl.zsh
youtube-dlc.zsh
# IntelliJ related files
.idea
*.iml
tmp/
venv/
# VS Code related files
.vscode
# SublimeText files
*.sublime-workspace
# Cookies
cookies
cookies.txt
# Text Editor / IDE
.idea
*.iml
.vscode
*.sublime-workspace
*.sublime-project
!yt-dlp.sublime-project
# Lazy extractors
*/extractor/lazy_extractors.py
# Plugins
ytdlp_plugins/extractor/*
!ytdlp_plugins/extractor/__init__.py

248
AUTHORS
View File

@@ -1,248 +0,0 @@
Ricardo Garcia Gonzalez
Danny Colligan
Benjamin Johnson
Vasyl' Vavrychuk
Witold Baryluk
Paweł Paprota
Gergely Imreh
Rogério Brito
Philipp Hagemeister
Sören Schulze
Kevin Ngo
Ori Avtalion
shizeeg
Filippo Valsorda
Christian Albrecht
Dave Vasilevsky
Jaime Marquínez Ferrándiz
Jeff Crouse
Osama Khalid
Michael Walter
M. Yasoob Ullah Khalid
Julien Fraichard
Johny Mo Swag
Axel Noack
Albert Kim
Pierre Rudloff
Huarong Huo
Ismael Mejía
Steffan Donal
Andras Elso
Jelle van der Waa
Marcin Cieślak
Anton Larionov
Takuya Tsuchida
Sergey M.
Michael Orlitzky
Chris Gahan
Saimadhav Heblikar
Mike Col
Oleg Prutz
pulpe
Andreas Schmitz
Michael Kaiser
Niklas Laxström
David Triendl
Anthony Weems
David Wagner
Juan C. Olivares
Mattias Harrysson
phaer
Sainyam Kapoor
Nicolas Évrard
Jason Normore
Hoje Lee
Adam Thalhammer
Georg Jähnig
Ralf Haring
Koki Takahashi
Ariset Llerena
Adam Malcontenti-Wilson
Tobias Bell
Naglis Jonaitis
Charles Chen
Hassaan Ali
Dobrosław Żybort
David Fabijan
Sebastian Haas
Alexander Kirk
Erik Johnson
Keith Beckman
Ole Ernst
Aaron McDaniel (mcd1992)
Magnus Kolstad
Hari Padmanaban
Carlos Ramos
5moufl
lenaten
Dennis Scheiba
Damon Timm
winwon
Xavier Beynon
Gabriel Schubiner
xantares
Jan Matějka
Mauroy Sébastien
William Sewell
Dao Hoang Son
Oskar Jauch
Matthew Rayfield
t0mm0
Tithen-Firion
Zack Fernandes
cryptonaut
Adrian Kretz
Mathias Rav
Petr Kutalek
Will Glynn
Max Reimann
Cédric Luthi
Thijs Vermeir
Joel Leclerc
Christopher Krooss
Ondřej Caletka
Dinesh S
Johan K. Jensen
Yen Chi Hsuan
Enam Mijbah Noor
David Luhmer
Shaya Goldberg
Paul Hartmann
Frans de Jonge
Robin de Rooij
Ryan Schmidt
Leslie P. Polzer
Duncan Keall
Alexander Mamay
Devin J. Pohly
Eduardo Ferro Aldama
Jeff Buchbinder
Amish Bhadeshia
Joram Schrijver
Will W.
Mohammad Teimori Pabandi
Roman Le Négrate
Matthias Küch
Julian Richen
Ping O.
Mister Hat
Peter Ding
jackyzy823
George Brighton
Remita Amine
Aurélio A. Heckert
Bernhard Minks
sceext
Zach Bruggeman
Tjark Saul
slangangular
Behrouz Abbasi
ngld
nyuszika7h
Shaun Walbridge
Lee Jenkins
Anssi Hannula
Lukáš Lalinský
Qijiang Fan
Rémy Léone
Marco Ferragina
reiv
Muratcan Simsek
Evan Lu
flatgreen
Brian Foley
Vignesh Venkat
Tom Gijselinck
Founder Fang
Andrew Alexeyew
Saso Bezlaj
Erwin de Haan
Jens Wille
Robin Houtevelts
Patrick Griffis
Aidan Rowe
mutantmonkey
Ben Congdon
Kacper Michajłow
José Joaquín Atria
Viťas Strádal
Kagami Hiiragi
Philip Huppert
blahgeek
Kevin Deldycke
inondle
Tomáš Čech
Déstin Reed
Roman Tsiupa
Artur Krysiak
Jakub Adam Wieczorek
Aleksandar Topuzović
Nehal Patel
Rob van Bekkum
Petr Zvoníček
Pratyush Singh
Aleksander Nitecki
Sebastian Blunt
Matěj Cepl
Xie Yanbo
Philip Xu
John Hawkinson
Rich Leeper
Zhong Jianxin
Thor77
Mattias Wadman
Arjan Verwer
Costy Petrisor
Logan B
Alex Seiler
Vijay Singh
Paul Hartmann
Stephen Chen
Fabian Stahl
Bagira
Odd Stråbø
Philip Herzog
Thomas Christlieb
Marek Rusinowski
Tobias Gruetzmacher
Olivier Bilodeau
Lars Vierbergen
Juanjo Benages
Xiao Di Guan
Thomas Winant
Daniel Twardowski
Jeremie Jarosh
Gerard Rovira
Marvin Ewald
Frédéric Bournival
Timendum
gritstub
Adam Voss
Mike Fährmann
Jan Kundrát
Giuseppe Fabiano
Örn Guðjónsson
Parmjit Virk
Genki Sky
Ľuboš Katrinec
Corey Nicholson
Ashutosh Chaudhary
John Dong
Tatsuyuki Ishi
Daniel Weber
Kay Bouché
Yang Hongbo
Lei Wang
Petr Novák
Leonardo Taccari
Martin Weinelt
Surya Oktafendri
TingPing
Alexandre Macabies
Bastian de Groot
Niklas Haas
András Veres-Szentkirályi
Enes Solak
Nathan Rossi
Thomas van der Berg
Luca Cherubin

View File

@@ -16,3 +16,5 @@ samiksome
alxnull
FelixFrog
Zocker1999NET
nao20010128nao
shirt-dev

View File

@@ -4,17 +4,111 @@
# Instuctions for creating release
* Run `make doc`
* Update Changelog.md and Authors-Fork
* Update Changelog.md and CONTRIBUTORS
* Change "Merged with youtube-dl" version in Readme.md if needed
* Commit to master as `Release <version>`
* Push to origin/release - build task will now run
* Update version.py and run `make issuetemplates`
* Commit to master as `[version] update :skip ci all`
* Update version.py using devscripts\update-version.py (be wary of timezones)
* Run `make issuetemplates`
* Commit to master as `[version] update :ci skip all`
* Push to origin/master
* Update changelog in /releases
-->
### 2021.02.09
* **aria2c support for DASH/HLS**: by [shirt](https://github.com/shirt-dev)
* **Implement Updater** (`-U`) by [shirt](https://github.com/shirt-dev)
* [youtube] Fix comment extraction
* [youtube_live_chat] Improve extraction
* [youtube] Fix for channel URLs sometimes not downloading all pages
* [aria2c] Changed default arguments to `--console-log-level=warn --summary-interval=0 --file-allocation=none -x16 -j16 -s16`
* Add fallback for thumbnails
* [embedthumbnail] Keep original thumbnail after conversion if write_thumbnail given
* [embedsubtitle] Keep original subtitle after conversion if write_subtitles given
* [pyinst.py] Move back to root dir
* [youtube] Simplified renderer parsing and bugfixes
* [movefiles] Fix compatibility with python2
* [remuxvideo] Fix validation of conditional remux
* [sponskrub] Don't raise error when the video does not exist
* [documentation] Crypto is an optional dependency
### 2021.02.04
* **Merge youtube-dl:** Upto [2021.02.04.1](https://github.com/ytdl-org/youtube-dl/releases/tag/2021.02.04.1)
* **Date/time formatting in output template:**
* You can use [`strftime`](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) to format date/time fields. Example: `%(upload_date>%Y-%m-%d)s`
* **Multiple output templates:**
* Separate output templates can be given for the different metadata files by using `-o TYPE:TEMPLATE`
* The allowed types are: `subtitle|thumbnail|description|annotation|infojson|pl_description|pl_infojson`
* [youtube] More metadata extraction for channel/playlist URLs (channel, uploader, thumbnail, tags)
* New option `--no-write-playlist-metafiles` to prevent writing playlist metadata files
* [audius] Fix extractor
* [youtube_live_chat] Fix `parse_yt_initial_data` and add `fragment_retries`
* [postprocessor] Raise errors correctly
* [metadatafromtitle] Fix bug when extracting data from numeric fields
* Fix issue with overwriting files
* Fix "Default format spec" appearing in quiet mode
* [FormatSort] Allow user to prefer av01 over vp9 (The default is still vp9)
* [FormatSort] fix bug where `quality` had more priority than `hasvid`
* [pyinst] Automatically detect python architecture and working directory
* Strip out internal fields such as `_filename` from infojson
### 2021.01.29
* **Features from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl)**: Co-authored by [animelover1984](https://github.com/animelover1984) and [bbepis](https://github.com/bbepis)
* Add `--get-comments`
* [youtube] Extract comments
* [billibilli] Added BiliBiliSearchIE, BilibiliChannelIE
* [billibilli] Extract comments
* [billibilli] Better video extraction
* Write playlist data to infojson
* [FFmpegMetadata] Embed infojson inside the video
* [EmbedThumbnail] Try embedding in mp4 using ffprobe and `-disposition`
* [EmbedThumbnail] Treat mka like mkv and mov like mp4
* [EmbedThumbnail] Embed in ogg/opus
* [VideoRemuxer] Conditionally remux video
* [VideoRemuxer] Add `-movflags +faststart` when remuxing to mp4
* [ffmpeg] Print entire stderr in verbose when there is error
* [EmbedSubtitle] Warn when embedding ass in mp4
* [anvato] Use NFLTokenGenerator if possible
* **Parse additional metadata**: New option `--parse-metadata` to extract additional metadata from existing fields
* The extracted fields can be used in `--output`
* Deprecated `--metadata-from-title`
* [Audius] Add extractor
* [youtube] Extract playlist description and write it to `.description` file
* Detect existing files even when using `recode`/`remux` (`extract-audio` is partially fixed)
* Fix wrong user config from v2021.01.24
* [youtube] Report error message from youtube as error instead of warning
* [FormatSort] Fix some fields not sorting from v2021.01.24
* [postprocessor] Deprecate `avconv`/`avprobe`. All current functionality is left untouched. But don't expect any new features to work with avconv
* [postprocessor] fix `write_debug` to not throw error when there is no `_downloader`
* [movefiles] Don't give "cant find" warning when move is unnecessary
* Refactor `update-version`, `pyinst.py` and related files
* [ffmpeg] Document more formats that are supported for remux/recode
### 2021.01.24
* **Merge youtube-dl:** Upto [2021.01.24](https://github.com/ytdl-org/youtube-dl/releases/tag/2021.01.16)
* Plugin support ([documentation](https://github.com/pukkandan/yt-dlp#plugins))
* **Multiple paths**: New option `-P`/`--paths` to give different paths for different types of files
* The syntax is `-P "type:path" -P "type:path"` ([documentation](https://github.com/pukkandan/yt-dlp#:~:text=-P,%20--paths%20TYPE:PATH))
* Valid types are: home, temp, description, annotation, subtitle, infojson, thumbnail
* Additionally, configuration file is taken from home directory or current directory ([documentation](https://github.com/pukkandan/yt-dlp#:~:text=Home%20Configuration))
* Allow passing different arguments to different external downloaders ([documentation](https://github.com/pukkandan/yt-dlp#:~:text=--downloader-args%20NAME:ARGS))
* [mildom] Add extractor by [nao20010128nao](https://github.com/nao20010128nao)
* Warn when using old style `--external-downloader-args` and `--post-processor-args`
* Fix `--no-overwrite` when using `--write-link`
* [sponskrub] Output `unrecognized argument` error message correctly
* [cbs] Make failure to extract title non-fatal
* Fix typecasting when pre-checking archive
* Fix issue with setting title on UNIX
* Deprecate redundant aliases in `formatSort`. The aliases remain functional for backward compatibility, but will be left undocumented
* [tests] Fix test_post_hooks
* [tests] Split core and download tests
### 2021.01.20
* [TrovoLive] Add extractor (only VODs)
* [pokemon] Add `/#/player` URLs
@@ -34,7 +128,7 @@
### 2021.01.14
* Added option `--break-on-reject`
* [roosterteeth.com] Fix for bonus episodes by @Zocker1999NET
* [roosterteeth.com] Fix for bonus episodes by [Zocker1999NET](https://github.com/Zocker1999NET)
* [tiktok] Fix for when share_info is empty
* [EmbedThumbnail] Fix bug due to incorrect function name
* [documentation] Changed sponskrub links to point to [pukkandan/sponskrub](https://github.com/pukkandan/SponSkrub) since I am now providing both linux and windows releases
@@ -43,18 +137,18 @@
### 2021.01.12
* [roosterteeth.com] Add subtitle support by @samiksome
* Added `--force-overwrites`, `--no-force-overwrites` by @alxnull
* [roosterteeth.com] Add subtitle support by [samiksome](https://github.com/samiksome)
* Added `--force-overwrites`, `--no-force-overwrites` by [alxnull](https://github.com/alxnull)
* Changed fork name to `yt-dlp`
* Fix typos by @FelixFrog
* Fix typos by [FelixFrog](https://github.com/FelixFrog)
* [ci] Option to skip
* [changelog] Added unreleased changes in blackjack4494/yt-dlc
### 2021.01.10
* [archive.org] Fix extractor and add support for audio and playlists by @wporr
* [Animelab] Added by @mariuszskon
* [youtube:search] Fix view_count by @ohnonot
* [archive.org] Fix extractor and add support for audio and playlists by [wporr](https://github.com/wporr)
* [Animelab] Added by [mariuszskon](https://github.com/mariuszskon)
* [youtube:search] Fix view_count by [ohnonot](https://github.com/ohnonot)
* [youtube] Show if video is embeddable in info
* Update version badge automatically in README
* Enable `test_youtube_search_matching`
@@ -63,11 +157,11 @@
### 2021.01.09
* [youtube] Fix bug in automatic caption extraction
* Add `post_hooks` to YoutubeDL by @alexmerkel
* Batch file enumeration improvements by @glenn-slayden
* Stop immediately when reaching `--max-downloads` by @glenn-slayden
* Fix incorrect ANSI sequence for restoring console-window title by @glenn-slayden
* Kill child processes when yt-dlc is killed by @Unrud
* Add `post_hooks` to YoutubeDL by [alexmerkel](https://github.com/alexmerkel)
* Batch file enumeration improvements by [glenn-slayden](https://github.com/glenn-slayden)
* Stop immediately when reaching `--max-downloads` by [glenn-slayden](https://github.com/glenn-slayden)
* Fix incorrect ANSI sequence for restoring console-window title by [glenn-slayden](https://github.com/glenn-slayden)
* Kill child processes when yt-dlc is killed by [Unrud](https://github.com/Unrud)
### 2021.01.08
@@ -77,11 +171,11 @@
### 2021.01.07-1
* [Akamai] fix by @nixxo
* [Tiktok] merge youtube-dl tiktok extractor by @GreyAlien502
* [vlive] add support for playlists by @kyuyeunk
* [youtube_live_chat] make sure playerOffsetMs is positive by @siikamiika
* Ignore extra data streams in ffmpeg by @jbruchon
* [Akamai] fix by [nixxo](https://github.com/nixxo)
* [Tiktok] merge youtube-dl tiktok extractor by [GreyAlien502](https://github.com/GreyAlien502)
* [vlive] add support for playlists by [kyuyeunk](https://github.com/kyuyeunk)
* [youtube_live_chat] make sure playerOffsetMs is positive by [siikamiika](https://github.com/siikamiika)
* Ignore extra data streams in ffmpeg by [jbruchon](https://github.com/jbruchon)
* Allow passing different arguments to different postprocessors using `--postprocessor-args`
* Deprecated `--sponskrub-args`. The same can now be done using `--postprocessor-args "sponskrub:<args>"`
* [CI] Split tests into core-test and full-test
@@ -111,15 +205,15 @@
* Changed video format sorting to show video only files and video+audio files together.
* Added `--video-multistreams`, `--no-video-multistreams`, `--audio-multistreams`, `--no-audio-multistreams`
* Added `b`,`w`,`v`,`a` as alias for `best`, `worst`, `video` and `audio` respectively
* **Shortcut Options:** Added `--write-link`, `--write-url-link`, `--write-webloc-link`, `--write-desktop-link` by @h-h-h-h - See [Internet Shortcut Options]README.md(#internet-shortcut-options) for details
* **Shortcut Options:** Added `--write-link`, `--write-url-link`, `--write-webloc-link`, `--write-desktop-link` by [h-h-h-h](https://github.com/h-h-h-h) - See [Internet Shortcut Options]README.md(#internet-shortcut-options) for details
* **Sponskrub integration:** Added `--sponskrub`, `--sponskrub-cut`, `--sponskrub-force`, `--sponskrub-location`, `--sponskrub-args` - See [SponSkrub Options](README.md#sponskrub-options-sponsorblock) for details
* Added `--force-download-archive` (`--force-write-archive`) by @h-h-h-h
* Added `--force-download-archive` (`--force-write-archive`) by [h-h-h-h](https://github.com/h-h-h-h)
* Added `--list-formats-as-table`, `--list-formats-old`
* **Negative Options:** Makes it possible to negate most boolean options by adding a `no-` to the switch. Usefull when you want to reverse an option that is defined in a config file
* Added `--no-ignore-dynamic-mpd`, `--no-allow-dynamic-mpd`, `--allow-dynamic-mpd`, `--youtube-include-hls-manifest`, `--no-youtube-include-hls-manifest`, `--no-youtube-skip-hls-manifest`, `--no-download`, `--no-download-archive`, `--resize-buffer`, `--part`, `--mtime`, `--no-keep-fragments`, `--no-cookies`, `--no-write-annotations`, `--no-write-info-json`, `--no-write-description`, `--no-write-thumbnail`, `--youtube-include-dash-manifest`, `--post-overwrites`, `--no-keep-video`, `--no-embed-subs`, `--no-embed-thumbnail`, `--no-add-metadata`, `--no-include-ads`, `--no-write-sub`, `--no-write-auto-sub`, `--no-playlist-reverse`, `--no-restrict-filenames`, `--youtube-include-dash-manifest`, `--no-format-sort-force`, `--flat-videos`, `--no-list-formats-as-table`, `--no-sponskrub`, `--no-sponskrub-cut`, `--no-sponskrub-force`
* Renamed: `--write-subs`, `--no-write-subs`, `--no-write-auto-subs`, `--write-auto-subs`. Note that these can still be used without the ending "s"
* Relaxed validation for format filters so that any arbitrary field can be used
* Fix for embedding thumbnail in mp3 by @pauldubois98
* Fix for embedding thumbnail in mp3 by [pauldubois98](https://github.com/pauldubois98) ([ytdl-org/youtube-dl#21569](https://github.com/ytdl-org/youtube-dl/pull/21569))
* Make Twitch Video ID output from Playlist and VOD extractor same. This is only a temporary fix
* **Merge youtube-dl:** Upto [2021.01.03](https://github.com/ytdl-org/youtube-dl/commit/8e953dcbb10a1a42f4e12e4e132657cb0100a1f8) - See [blackjack4494/yt-dlc#280](https://github.com/blackjack4494/yt-dlc/pull/280) for details
* Extractors [tiktok](https://github.com/ytdl-org/youtube-dl/commit/fb626c05867deab04425bad0c0b16b55473841a2) and [hotstar](https://github.com/ytdl-org/youtube-dl/commit/bb38a1215718cdf36d73ff0a7830a64cd9fa37cc) have not been merged
@@ -136,7 +230,7 @@
* Redirect channel home to /video
* Print youtube's warning message
* Multiple pages are handled better for feeds
* Add --break-on-existing by @gergesh
* Add --break-on-existing by [gergesh](https://github.com/gergesh)
* Pre-check video IDs in the archive before downloading
* [bitwave.tv] New extractor
* [Gedi] Add extractor

206
README.md
View File

@@ -1,9 +1,14 @@
# YT-DLP
<!-- See: https://github.com/marketplace/actions/dynamic-badges -->
[![Release Version](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/pukkandan/c69cb23c3c5b3316248e52022790aa57/raw/version.json&color=brightgreen)](https://github.com/pukkandan/yt-dlp/releases/latest)
[![License: Unlicense](https://img.shields.io/badge/License-Unlicense-blue.svg)](https://github.com/pukkandan/yt-dlp/blob/master/LICENSE)
[![Release version](https://img.shields.io/github/v/release/pukkandan/yt-dlp?color=brightgreen&label=Release)](https://github.com/pukkandan/yt-dlp/releases/latest)
[![License: Unlicense](https://img.shields.io/badge/License-Unlicense-blue.svg)](LICENSE)
[![CI Status](https://github.com/pukkandan/yt-dlp/workflows/Core%20Tests/badge.svg?branch=master)](https://github.com/pukkandan/yt-dlp/actions)
[![Discord](https://img.shields.io/discord/807245652072857610?color=blue&label=discord&logo=discord)](https://discord.gg/S75JaBna)
[![Commits](https://img.shields.io/github/commit-activity/m/pukkandan/yt-dlp?label=commits)](https://github.com/pukkandan/yt-dlp/commits)
[![Last Commit](https://img.shields.io/github/last-commit/pukkandan/yt-dlp/master)](https://github.com/pukkandan/yt-dlp/commits)
[![Downloads](https://img.shields.io/github/downloads/pukkandan/yt-dlp/total)](https://github.com/pukkandan/yt-dlp/releases/latest)
[![PyPi Downloads](https://img.shields.io/pypi/dm/yt-dlp?label=PyPi)](https://pypi.org/project/yt-dlp)
A command-line program to download videos from youtube.com and many other [video platforms](docs/supportedsites.md)
@@ -51,20 +56,31 @@ The major new features from the latest release of [blackjack4494/yt-dlc](https:/
* **[Format Sorting](#sorting-formats)**: The default format sorting options have been changed so that higher resolution and better codecs will be now preferred instead of simply using larger bitrate. Furthermore, you can now specify the sort order using `-S`. This allows for much easier format selection that what is possible by simply using `--format` ([examples](#format-selection-examples))
* **Merged with youtube-dl v2021.01.16**: You get all the latest features and patches of [youtube-dl](https://github.com/ytdl-org/youtube-dl) in addition to all the features of [youtube-dlc](https://github.com/blackjack4494/yt-dlc)
* **Merged with youtube-dl v2021.02.04.1**: You get all the latest features and patches of [youtube-dl](https://github.com/ytdl-org/youtube-dl) in addition to all the features of [youtube-dlc](https://github.com/blackjack4494/yt-dlc)
* **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--get-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, Playlist infojson etc. Note that the NicoNico improvements are not available. See [#31](https://github.com/pukkandan/yt-dlp/pull/31) for details.
* **Youtube improvements**:
* All Youtube Feeds (`:ytfav`, `:ytwatchlater`, `:ytsubs`, `:ythistory`, `:ytrec`) works correctly and support downloading multiple pages of content
* Youtube search works correctly (`ytsearch:`, `ytsearchdate:`) along with Search URLs
* Redirect channel's home URL automatically to `/video` to preserve the old behaviour
* **New extractors**: Trovo.live, AnimeLab, Philo MSO, Rcs, Gedi, bitwave.tv
* **New extractors**: AnimeLab, Philo MSO, Rcs, Gedi, bitwave.tv, mildom, audius
* **Fixed extractors**: archive.org, roosterteeth.com, skyit, instagram, itv, SouthparkDe, spreaker, Vlive, tiktok, akamai, ina
* **New options**: `--list-formats-as-table`, `--write-link`, `--force-download-archive`, `--force-overwrites`, `--break-on-reject` etc
* **Plugin support**: Extractors can be loaded from an external file. See [plugins](#plugins) for details
* **Multiple paths and output templates**: You can give different [output templates](#output-template) and download paths for different types of files. You can also set a temporary path where intermediary files are downloaded to. See [`--paths`](https://github.com/pukkandan/yt-dlp/#:~:text=-P,%20--paths%20TYPE:PATH) for details
<!-- Relative link doesn't work for "#:~:text=" -->
* **Portable Configuration**: Configuration files are automatically loaded from the home and root directories. See [configuration](#configuration) for details
* **Other new options**: `--parse-metadata`, `--list-formats-as-table`, `--write-link`, `--force-download-archive`, `--force-overwrites`, `--break-on-reject` etc
* **Improvements**: Multiple `--postprocessor-args` and `--external-downloader-args`, Date/time formatting in `-o`, faster archive checking, more [format selection options](#format-selection) etc
* **Improvements**: Multiple `--postprocessor-args`, `%(duration_string)s` in `-o`, faster archive checking, more [format selection options](#format-selection) etc
See [changelog](Changelog.md) or [commits](https://github.com/pukkandan/yt-dlp/commits) for the full list of changes
@@ -77,7 +93,7 @@ If you are coming from [youtube-dl](https://github.com/ytdl-org/youtube-dl), the
# INSTALLATION
You can install yt-dlp using one of the following methods:
* Use [PyPI package](https://pypi.org/project/yt-dlp/): `python -m pip install --upgrade yt-dlp`
* Use [PyPI package](https://pypi.org/project/yt-dlp): `python -m pip install --upgrade yt-dlp`
* Download the binary from the [latest release](https://github.com/pukkandan/yt-dlp/releases/latest)
* Use pip+git: `python -m pip install --upgrade git+https://github.com/pukkandan/yt-dlp.git@release`
* Install master branch: `python -m pip install --upgrade git+https://github.com/pukkandan/yt-dlp`
@@ -88,17 +104,16 @@ You can install yt-dlp using one of the following methods:
### COMPILE
**For Windows**:
To build the Windows executable yourself (without version info!)
To build the Windows executable, you must have pyinstaller (and optionally mutagen and Crypto)
python -m pip install --upgrade pyinstaller mutagen Crypto
Once you have all the necessary dependancies installed, just run `py pyinst.py`. The executable will be built for the same architecture (32/64 bit) as the python used to build it. It is strongly reccomended to use python3 although python2.6+ is supported.
You can also build the executable without any version info or metadata by using:
python -m pip install --upgrade pyinstaller
pyinstaller.exe youtube_dlc\__main__.py --onefile --name youtube-dlc
Or simply execute the `make_win.bat` if pyinstaller is installed.
There will be a `youtube-dlc.exe` in `/dist`
New way to build Windows is to use `python pyinst.py` (please use python3 64Bit)
For 32Bit Version use a 32Bit Version of python (3 preferred here as well) and run `python pyinst32.py`
**For Unix**:
You will need the required build tools
python, make (GNU), pandoc, zip, nosetests
@@ -106,6 +121,7 @@ Then simply type this
make
**Note**: In either platform, `devscripts\update-version.py` can be used to automatically update the version number
# DESCRIPTION
**youtube-dlc** is a command-line program to download videos from youtube.com many other [video platforms](docs/supportedsites.md). It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on macOS. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
@@ -309,8 +325,8 @@ Then simply type this
--downloader-args NAME:ARGS Give these arguments to the external
downloader. Specify the downloader name and
the arguments separated by a colon ":". You
can use this option multiple times (Alias:
--external-downloader-args)
can use this option multiple times
(Alias: --external-downloader-args)
## Filesystem Options:
-a, --batch-file FILE File containing URLs to download ('-' for
@@ -319,17 +335,20 @@ Then simply type this
comments and ignored
-P, --paths TYPE:PATH The paths where the files should be
downloaded. Specify the type of file and
the path separated by a colon ":"
(supported: description|annotation|subtitle
|infojson|thumbnail). Additionally, you can
also provide "home" and "temp" paths. All
intermediary files are first downloaded to
the temp path and then the final files are
moved over to the home path after download
is finished. Note that this option is
ignored if --output is an absolute path
-o, --output TEMPLATE Output filename template, see "OUTPUT
the path separated by a colon ":". All the
same types as --output are supported.
Additionally, you can also provide "home"
and "temp" paths. All intermediary files
are first downloaded to the temp path and
then the final files are moved over to the
home path after download is finished. This
option is ignored if --output is an
absolute path
-o, --output [TYPE:]TEMPLATE Output filename template, see "OUTPUT
TEMPLATE" for details
--output-na-placeholder TEXT Placeholder value for unavailable meta
fields in output filename template
(default: "NA")
--autonumber-start NUMBER Specify the start value for %(autonumber)s
(default is 1)
--restrict-filenames Restrict filenames to only ASCII
@@ -342,9 +361,11 @@ Then simply type this
This option includes --no-continue
--no-force-overwrites Do not overwrite the video, but overwrite
related files (default)
-c, --continue Resume partially downloaded files (default)
--no-continue Restart download of partially downloaded
files from beginning
-c, --continue Resume partially downloaded files/fragments
(default)
--no-continue Do not resume partially downloaded
fragments. If the file is unfragmented,
restart download of the entire file
--part Use .part files instead of writing directly
into output file (default)
--no-part Do not use .part files - write directly
@@ -357,10 +378,18 @@ Then simply type this
file
--no-write-description Do not write video description (default)
--write-info-json Write video metadata to a .info.json file
(this may contain personal information)
--no-write-info-json Do not write video metadata (default)
--write-annotations Write video annotations to a
.annotations.xml file
--no-write-annotations Do not write video annotations (default)
--write-playlist-metafiles Write playlist metadata in addition to the
video metadata when using --write-info-json,
--write-description etc. (default)
--no-write-playlist-metafiles Do not write playlist metadata when using
--write-info-json, --write-description etc.
--get-comments Retrieve video comments to be placed in the
.info.json file
--load-info-json FILE JSON file containing the video information
(created with the "--write-info-json"
option)
@@ -495,17 +524,17 @@ Then simply type this
--list-formats-old Present the output of -F in the old form
(Alias: --no-list-formats-as-table)
--youtube-include-dash-manifest Download the DASH manifests and related
data on YouTube videos (default) (Alias:
--no-youtube-skip-dash-manifest)
data on YouTube videos (default)
(Alias: --no-youtube-skip-dash-manifest)
--youtube-skip-dash-manifest Do not download the DASH manifests and
related data on YouTube videos (Alias:
--no-youtube-include-dash-manifest)
related data on YouTube videos
(Alias: --no-youtube-include-dash-manifest)
--youtube-include-hls-manifest Download the HLS manifests and related data
on YouTube videos (default) (Alias:
--no-youtube-skip-hls-manifest)
on YouTube videos (default)
(Alias: --no-youtube-skip-hls-manifest)
--youtube-skip-hls-manifest Do not download the HLS manifests and
related data on YouTube videos (Alias:
--no-youtube-include-hls-manifest)
related data on YouTube videos
(Alias: --no-youtube-include-hls-manifest)
--merge-output-format FORMAT If a merge is required (e.g.
bestvideo+bestaudio), output to given
container format. One of mkv, mp4, ogg,
@@ -549,23 +578,26 @@ Then simply type this
## Post-Processing Options:
-x, --extract-audio Convert video files to audio-only files
(requires ffmpeg or avconv and ffprobe or
avprobe)
(requires ffmpeg and ffprobe)
--audio-format FORMAT Specify audio format: "best", "aac",
"flac", "mp3", "m4a", "opus", "vorbis", or
"wav"; "best" by default; No effect without
-x
--audio-quality QUALITY Specify ffmpeg/avconv audio quality, insert
a value between 0 (better) and 9 (worse)
for VBR or a specific bitrate like 128K
--audio-quality QUALITY Specify ffmpeg audio quality, insert a
value between 0 (better) and 9 (worse) for
VBR or a specific bitrate like 128K
(default 5)
--remux-video FORMAT Remux the video into another container if
necessary (currently supported: mp4|mkv).
If target container does not support the
video/audio codec, remuxing will fail
necessary (currently supported: mp4|mkv|flv
|webm|mov|avi|mp3|mka|m4a|ogg|opus). If
target container does not support the
video/audio codec, remuxing will fail. You
can specify multiple rules; eg.
"aac>m4a/mov>mp4/mkv" will remux aac to
m4a, mov to mp4 and anything else to mkv.
--recode-video FORMAT Re-encode the video into another format if
re-encoding is necessary (currently
supported: mp4|flv|ogg|webm|mkv|avi)
re-encoding is necessary. The supported
formats are the same as --remux-video
--postprocessor-args NAME:ARGS Give these arguments to the postprocessors.
Specify the postprocessor/executable name
and the arguments separated by a colon ":"
@@ -577,15 +609,14 @@ Then simply type this
FixupStretched, FixupM4a, FixupM3u8,
SubtitlesConvertor and EmbedThumbnail. The
supported executables are: SponSkrub,
FFmpeg, FFprobe, avconf, avprobe and
AtomicParsley. You can use this option
multiple times to give different arguments
to different postprocessors. You can also
specify "PP+EXE:ARGS" to give the arguments
to the specified executable only when being
used by the specified postprocessor. You
can use this option multiple times (Alias:
--ppa)
FFmpeg, FFprobe, and AtomicParsley. You can
use this option multiple times to give
different arguments to different
postprocessors. You can also specify
"PP+EXE:ARGS" to give the arguments to the
specified executable only when being used
by the specified postprocessor. You can use
this option multiple times (Alias: --ppa)
-k, --keep-video Keep the intermediate video file on disk
after post-processing
--no-keep-video Delete the intermediate video file after
@@ -599,16 +630,20 @@ Then simply type this
--no-embed-thumbnail Do not embed thumbnail (default)
--add-metadata Write metadata to the video file
--no-add-metadata Do not write metadata (default)
--metadata-from-title FORMAT Parse additional metadata like song title /
artist from the video title. The format
syntax is the same as --output. Regular
expression with named capture groups may
--parse-metadata FIELD:FORMAT Parse additional metadata like title/artist
from other fields. Give field name to
extract data from, and format of the field
seperated by a ":". Either regular
expression with named capture groups or a
similar syntax to the output template can
also be used. The parsed parameters replace
existing values. Example: --metadata-from-
title "%(artist)s - %(title)s" matches a
any existing values and can be use in
output templateThis option can be used
multiple times. Example: --parse-metadata
"title:%(artist)s - %(title)s" matches a
title like "Coldplay - Paradise". Example
(regex): --metadata-from-title
"(?P<artist>.+?) - (?P<title>.+)"
(regex): --parse-metadata
"description:Artist - (?P<artist>.+?)"
--xattrs Write metadata to the video file's xattrs
(using dublin core and xdg standards)
--fixup POLICY Automatically correct known faults of the
@@ -616,15 +651,9 @@ Then simply type this
emit a warning), detect_or_warn (the
default; fix file if we can, warn
otherwise)
--prefer-avconv Prefer avconv over ffmpeg for running the
postprocessors (Alias: --no-prefer-ffmpeg)
--prefer-ffmpeg Prefer ffmpeg over avconv for running the
postprocessors (default)
(Alias: --no-prefer-avconv)
--ffmpeg-location PATH Location of the ffmpeg/avconv binary;
either the path to the binary or its
containing directory
(Alias: --avconv-location)
--ffmpeg-location PATH Location of the ffmpeg binary; either the
path to the binary or its containing
directory
--exec CMD Execute a command on the file after
downloading and post-processing, similar to
find's -exec syntax. Example: --exec 'adb
@@ -727,7 +756,11 @@ The `-o` option is used to indicate a template for the output file names while `
**tl;dr:** [navigate me to examples](#output-template-examples).
The basic usage is not to set any template arguments when downloading a single file, like in `youtube-dlc -o funny_video.flv "https://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences may be formatted according to [python string formatting operations](https://docs.python.org/2/library/stdtypes.html#string-formatting). For example, `%(NAME)s` or `%(NAME)05d`. To clarify, that is a percent symbol followed by a name in parentheses, followed by formatting operations. Allowed names along with sequence type are:
The basic usage of `-o` is not to set any template arguments when downloading a single file, like in `youtube-dlc -o funny_video.flv "https://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences may be formatted according to [python string formatting operations](https://docs.python.org/2/library/stdtypes.html#string-formatting). For example, `%(NAME)s` or `%(NAME)05d`. To clarify, that is a percent symbol followed by a name in parentheses, followed by formatting operations. Date/time fields can also be formatted according to [strftime formatting](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) by specifying it inside the parantheses seperated from the field name using a `>`. For example, `%(duration>%H-%M-%S)s`.
Additionally, you can set different output templates for the various metadata files seperately from the general output template by specifying the type of file followed by the template seperated by a colon ":". The different filetypes supported are `subtitle|thumbnail|description|annotation|infojson|pl_description|pl_infojson`. For example, `-o '%(title)s.%(ext)s' -o 'thumbnail:%(title)s\%(title)s.%(ext)s'` will put the thumbnails in a folder with the same name as the video.
The available fields are:
- `id` (string): Video identifier
- `title` (string): Video title
@@ -834,7 +867,7 @@ If you are using an output template inside a Windows batch file then you must es
#### Output template examples
Note that on Windows you may need to use double quotes instead of single.
Note that on Windows you need to use double quotes instead of single.
```bash
$ youtube-dlc --get-filename -o '%(title)s.%(ext)s' BaW_jenozKc
@@ -846,14 +879,17 @@ youtube-dlc_test_video_.mp4 # A simple file name
# Download YouTube playlist videos in separate directory indexed by video order in a playlist
$ youtube-dlc -o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re
# Download YouTube playlist videos in seperate directories according to their uploaded year
$ youtube-dlc -o '%(upload_date>%Y)s/%(title)s.%(ext)s' https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re
# Download all playlists of YouTube channel/user keeping each playlist in separate directory:
$ youtube-dlc -o '%(uploader)s/%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' https://www.youtube.com/user/TheLinuxFoundation/playlists
# Download Udemy course keeping each chapter in separate directory under MyVideos directory in your home
$ youtube-dlc -u user -p password -o '~/MyVideos/%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s' https://www.udemy.com/java-tutorial/
$ youtube-dlc -u user -p password -P '~/MyVideos' -o '%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s' https://www.udemy.com/java-tutorial/
# Download entire series season keeping each series and each season in separate directory under C:/MyVideos
$ youtube-dlc -o "C:/MyVideos/%(series)s/%(season_number)s - %(season)s/%(episode_number)s - %(episode)s.%(ext)s" https://videomore.ru/kino_v_detalayah/5_sezon/367617
$ youtube-dlc -P "C:/MyVideos" -o "%(series)s/%(season_number)s - %(season)s/%(episode_number)s - %(episode)s.%(ext)s" https://videomore.ru/kino_v_detalayah/5_sezon/367617
# Stream the video being downloaded to stdout
$ youtube-dlc -o - BaW_jenozKc
@@ -862,7 +898,7 @@ $ youtube-dlc -o - BaW_jenozKc
# FORMAT SELECTION
By default, youtube-dlc tries to download the best available quality if you **don't** pass any options.
This is generally equivalent to using `-f bestvideo*+bestaudio/best`. However, if multiple audiostreams is enabled (`--audio-multistreams`), the default format changes to `-f bestvideo+bestaudio/best`. Similarly, if ffmpeg and avconv are unavailable, or if you use youtube-dlc to stream to `stdout` (`-o -`), the default becomes `-f best/bestvideo+bestaudio`.
This is generally equivalent to using `-f bestvideo*+bestaudio/best`. However, if multiple audiostreams is enabled (`--audio-multistreams`), the default format changes to `-f bestvideo+bestaudio/best`. Similarly, if ffmpeg is unavailable, or if you use youtube-dlc to stream to `stdout` (`-o -`), the default becomes `-f best/bestvideo+bestaudio`.
The general syntax for format selection is `--f FORMAT` (or `--format FORMAT`) where `FORMAT` is a *selector expression*, i.e. an expression that describes format or formats you would like to download.
@@ -893,7 +929,7 @@ If you want to download multiple videos and they don't have the same formats ava
If you want to download several formats of the same video use a comma as a separator, e.g. `-f 22,17,18` will download all these three formats, of course if they are available. Or a more sophisticated example combined with the precedence feature: `-f 136/137/mp4/bestvideo,140/m4a/bestaudio`.
You can merge the video and audio of multiple formats into a single file using `-f <format1>+<format2>+...` (requires ffmpeg or avconv installed), for example `-f bestvideo+bestaudio` will download the best video-only format, the best audio-only format and mux them together with ffmpeg/avconv. If `--no-video-multistreams` is used, all formats with a video stream except the first one are ignored. Similarly, if `--no-audio-multistreams` is used, all formats with an audio stream except the first one are ignored. For example, `-f bestvideo+best+bestaudio` will download and merge all 3 given formats. The resulting file will have 2 video streams and 2 audio streams. But `-f bestvideo+best+bestaudio --no-video-multistreams` will download and merge only `bestvideo` and `bestaudio`. `best` is ignored since another format containing a video stream (`bestvideo`) has already been selected. The order of the formats is therefore important. `-f best+bestaudio --no-audio-multistreams` will download and merge both formats while `-f bestaudio+best --no-audio-multistreams` will ignore `best` and download only `bestaudio`.
You can merge the video and audio of multiple formats into a single file using `-f <format1>+<format2>+...` (requires ffmpeg installed), for example `-f bestvideo+bestaudio` will download the best video-only format, the best audio-only format and mux them together with ffmpeg. If `--no-video-multistreams` is used, all formats with a video stream except the first one are ignored. Similarly, if `--no-audio-multistreams` is used, all formats with an audio stream except the first one are ignored. For example, `-f bestvideo+best+bestaudio` will download and merge all 3 given formats. The resulting file will have 2 video streams and 2 audio streams. But `-f bestvideo+best+bestaudio --no-video-multistreams` will download and merge only `bestvideo` and `bestaudio`. `best` is ignored since another format containing a video stream (`bestvideo`) has already been selected. The order of the formats is therefore important. `-f best+bestaudio --no-audio-multistreams` will download and merge both formats while `-f bestaudio+best --no-audio-multistreams` will ignore `best` and download only `bestaudio`.
## Filtering Formats
@@ -939,7 +975,7 @@ You can change the criteria for being considered the `best` by using `-S` (`--fo
- `quality`: The quality of the format. This is a metadata field available in some websites
- `source`: Preference of the source as given by the extractor
- `proto`: Protocol used for download (`https`/`ftps` > `http`/`ftp` > `m3u8-native` > `m3u8` > `http-dash-segments` > other > `mms`/`rtsp` > unknown > `f4f`/`f4m`)
- `vcodec`: Video Codec (`vp9` > `h265` > `h264` > `vp8` > `h263` > `theora` > other > unknown)
- `vcodec`: Video Codec (`av01` > `vp9` > `h265` > `h264` > `vp8` > `h263` > `theora` > other > unknown)
- `acodec`: Audio Codec (`opus` > `vorbis` > `aac` > `mp4a` > `mp3` > `ac3` > `dts` > other > unknown)
- `codec`: Equivalent to `vcodec,acodec`
- `vext`: Video Extension (`mp4` > `webm` > `flv` > other > unknown). If `--prefer-free-formats` is used, `webm` is prefered.
@@ -960,7 +996,7 @@ You can change the criteria for being considered the `best` by using `-S` (`--fo
Note that any other **numerical** field made available by the extractor can also be used. All fields, unless specified otherwise, are sorted in decending order. To reverse this, prefix the field with a `+`. Eg: `+res` prefers format with the smallest resolution. Additionally, you can suffix a prefered value for the fields, seperated by a `:`. Eg: `res:720` prefers larger videos, but no larger than 720p and the smallest video if there are no videos less than 720p. For `codec` and `ext`, you can provide two prefered values, the first for video and the second for audio. Eg: `+codec:avc:m4a` (equivalent to `+vcodec:avc,+acodec:m4a`) sets the video codec preference to `h264` > `h265` > `vp9` > `vp8` > `h263` > `theora` and audio codec preference to `mp4a` > `aac` > `vorbis` > `opus` > `mp3` > `ac3` > `dts`. You can also make the sorting prefer the nearest values to the provided by using `~` as the delimiter. Eg: `filesize~1G` prefers the format with filesize closest to 1 GiB.
The fields `hasvid`, `ie_pref`, `lang`, `quality` are always given highest priority in sorting, irrespective of the user-defined order. This behaviour can be changed by using `--force-format-sort`. Apart from these, the default order used is: `res,fps,codec,size,br,asr,proto,ext,hasaud,source,id`. Note that the extractors may override this default order, but they cannot override the user-provided order.
The fields `hasvid`, `ie_pref`, `lang`, `quality` are always given highest priority in sorting, irrespective of the user-defined order. This behaviour can be changed by using `--force-format-sort`. Apart from these, the default order used is: `res,fps,codec:vp9,size,br,asr,proto,ext,hasaud,source,id`. Note that the extractors may override this default order, but they cannot override the user-provided order.
If your format selector is `worst`, the last item is selected after sorting. This means it will select the format that is worst in all repects. Most of the time, what you actually want is the video with the smallest filesize instead. So it is generally better to use `-f best -S +size,+br,+res,+fps`.
@@ -1087,7 +1123,7 @@ $ youtube-dlc -S '+res:480,codec,br'
Plugins are loaded from `<root-dir>/ytdlp_plugins/<type>/__init__.py`. Currently only `extractor` plugins are supported. Support for `downloader` and `postprocessor` plugins may be added in the future. See [ytdlp_plugins](ytdlp_plugins) for example.
**Note**: `<root-dir>` is the directory of the binary (`<root-dir>/youtube-dlc`), or the root directory of the module if you are running directly from source-code ((`<root dir>/youtube_dlc/__main__.py`)
**Note**: `<root-dir>` is the directory of the binary (`<root-dir>/youtube-dlc`), or the root directory of the module if you are running directly from source-code (`<root dir>/youtube_dlc/__main__.py`)
# MORE
For FAQ, Developer Instructions etc., see the [original README](https://github.com/ytdl-org/youtube-dl)
For FAQ, Developer Instructions etc., see the [original README](https://github.com/ytdl-org/youtube-dl#faq)

View File

Before

Width:  |  Height:  |  Size: 4.2 KiB

After

Width:  |  Height:  |  Size: 4.2 KiB

View File

@@ -0,0 +1,31 @@
from __future__ import unicode_literals
from datetime import datetime
# import urllib.request
# response = urllib.request.urlopen('https://blackjack4494.github.io/youtube-dlc/update/LATEST_VERSION')
# old_version = response.read().decode('utf-8')
exec(compile(open('youtube_dlc/version.py').read(), 'youtube_dlc/version.py', 'exec'))
old_version = locals()['__version__']
old_version_list = old_version.split(".", 4)
old_ver = '.'.join(old_version_list[:3])
old_rev = old_version_list[3] if len(old_version_list) > 3 else ''
ver = datetime.now().strftime("%Y.%m.%d")
rev = str(int(old_rev or 0) + 1) if old_ver == ver else ''
VERSION = '.'.join((ver, rev)) if rev else ver
# VERSION_LIST = [(int(v) for v in ver.split(".") + [rev or 0])]
print('::set-output name=ytdlc_version::' + VERSION)
file_version_py = open('youtube_dlc/version.py', 'rt')
data = file_version_py.read()
data = data.replace(old_version, VERSION)
file_version_py.close()
file_version_py = open('youtube_dlc/version.py', 'wt')
file_version_py.write(data)
file_version_py.close()

View File

@@ -47,12 +47,13 @@
- **Amara**
- **AMCNetworks**
- **AmericasTestKitchen**
- **AmericasTestKitchenSeason**
- **anderetijden**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **AnimeLab**
- **AnimeLabShows**
- **AnimeOnDemand**
- **Anvato**
- **aol.com**
- **aol.com**: Yahoo screen and movies
- **APA**
- **Aparat**
- **AppleConnect**
@@ -79,6 +80,9 @@
- **AudioBoom**
- **audiomack**
- **audiomack:album**
- **Audius**: Audius.co
- **audius:playlist**: Audius.co playlists
- **audius:track**: Audius track ID or API link. Prepend with "audius:"
- **AWAAN**
- **awaan:live**
- **awaan:season**
@@ -111,7 +115,9 @@
- **BiliBili**
- **BilibiliAudio**
- **BilibiliAudioAlbum**
- **BilibiliChannel**
- **BiliBiliPlayer**
- **BiliBiliSearch**: Bilibili video search, "bilisearch" keyword
- **BioBioChileTV**
- **Biography**
- **BIQLE**
@@ -197,8 +203,6 @@
- **CNNArticle**
- **CNNBlogs**
- **ComedyCentral**
- **ComedyCentralFullEpisodes**
- **ComedyCentralShortname**
- **ComedyCentralTV**
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
- **CONtv**
@@ -520,6 +524,12 @@
- **Mgoon**
- **MGTV**: 芒果TV
- **MiaoPai**
- **mildom**: Record ongoing live by specific user in Mildom
- **mildom:user:vod**: Download all VODs from specific user in Mildom
- **mildom:vod**: Download a VOD in Mildom
- **minds**
- **minds:channel**
- **minds:group**
- **MinistryGrid**
- **Minoto**
- **miomio.tv**
@@ -549,6 +559,7 @@
- **mtv:video**
- **mtvjapan**
- **mtvservices:embedded**
- **MTVUutisetArticle**
- **MuenchenTV**: münchen.tv
- **mva**: Microsoft Virtual Academy videos
- **mva:course**: Microsoft Virtual Academy courses
@@ -880,6 +891,8 @@
- **Sport5**
- **SportBox**
- **SportDeutschland**
- **spotify**
- **spotify:show**
- **Spreaker**
- **SpreakerPage**
- **SpreakerShow**
@@ -962,13 +975,13 @@
- **TNAFlixNetworkEmbed**
- **toggle**
- **ToonGoggles**
- **Tosh**: Tosh.0
- **tou.tv**
- **Toypics**: Toypics video
- **ToypicsUser**: Toypics user profile
- **TrailerAddict** (Currently broken)
- **Trilulilu**
- **TrovoLive**
- **Trovo**
- **TrovoVod**
- **TruNews**
- **TruTV**
- **Tube8**
@@ -1078,7 +1091,6 @@
- **vidme**
- **vidme:user**
- **vidme:user:likes**
- **Vidzi**
- **vier**: vier.be and vijf.be
- **vier:videos**
- **viewlift**
@@ -1123,6 +1135,7 @@
- **vrv**
- **vrv:series**
- **VShare**
- **VTM**
- **VTXTV**
- **vube**: Vube.com
- **VuClip**
@@ -1205,9 +1218,9 @@
- **youtube:history**: Youtube watch history, ":ythistory" for short (requires authentication)
- **youtube:playlist**: YouTube.com playlists
- **youtube:recommended**: YouTube.com recommended videos, ":ytrec" for short (requires authentication)
- **youtube:search**: YouTube.com searches
- **youtube:search**: YouTube.com searches, "ytsearch" keyword
- **youtube:search:date**: YouTube.com searches, newest videos first, "ytsearchdate" keyword
- **youtube:search_url**: YouTube.com searches, "ytsearch" keyword
- **youtube:search_url**: YouTube.com search URLs
- **youtube:subscriptions**: YouTube.com subscriptions feed, ":ytsubs" for short (requires authentication)
- **youtube:tab**: YouTube.com tab
- **youtube:watchlater**: Youtube watch later list, ":ytwatchlater" for short (requires authentication)

View File

@@ -1 +0,0 @@
py -m PyInstaller youtube_dlc\__main__.py --onefile --name youtube-dlc --version-file win\ver.txt --icon win\icon\cloud.ico --upx-exclude=vcruntime140.dll --exclude-module ytdlp_plugins

112
pyinst.py
View File

@@ -1,55 +1,41 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
import sys
# import os
import platform
from PyInstaller.utils.win32.versioninfo import (
VarStruct, VarFileInfo, StringStruct, StringTable,
StringFileInfo, FixedFileInfo, VSVersionInfo, SetVersion,
)
import PyInstaller.__main__
from datetime import datetime
arch = sys.argv[1] if len(sys.argv) > 1 else platform.architecture()[0][:2]
assert arch in ('32', '64')
print('Building %sbit version' % arch)
_x86 = '_x86' if arch == '32' else ''
FILE_DESCRIPTION = 'Media Downloader'
FILE_DESCRIPTION = 'Media Downloader%s' % (' (32 Bit)' if _x86 else '')
# root_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
# print('Changing working directory to %s' % root_dir)
# os.chdir(root_dir)
exec(compile(open('youtube_dlc/version.py').read(), 'youtube_dlc/version.py', 'exec'))
VERSION = locals()['__version__']
_LATEST_VERSION = locals()['__version__']
VERSION_LIST = VERSION.split('.')
VERSION_LIST = list(map(int, VERSION_LIST)) + [0] * (4 - len(VERSION_LIST))
_OLD_VERSION = _LATEST_VERSION.rsplit("-", 1)
print('Version: %s%s' % (VERSION, _x86))
print('Remember to update the version using devscipts\\update-version.py')
if len(_OLD_VERSION) > 0:
old_ver = _OLD_VERSION[0]
old_rev = ''
if len(_OLD_VERSION) > 1:
old_rev = _OLD_VERSION[1]
now = datetime.now()
# ver = f'{datetime.today():%Y.%m.%d}'
ver = now.strftime("%Y.%m.%d")
rev = ''
if old_ver == ver:
if old_rev:
rev = int(old_rev) + 1
else:
rev = 1
_SEPARATOR = '-'
version = _SEPARATOR.join(filter(None, [ver, str(rev)]))
print(version)
version_list = ver.split(".")
_year, _month, _day = [int(value) for value in version_list]
_rev = 0
if rev:
_rev = rev
_ver_tuple = _year, _month, _day, _rev
version_file = VSVersionInfo(
VERSION_FILE = VSVersionInfo(
ffi=FixedFileInfo(
filevers=_ver_tuple,
prodvers=_ver_tuple,
filevers=VERSION_LIST,
prodvers=VERSION_LIST,
mask=0x3F,
flags=0x0,
OS=0x4,
@@ -58,35 +44,35 @@ version_file = VSVersionInfo(
date=(0, 0),
),
kids=[
StringFileInfo(
[
StringTable(
"040904B0",
[
StringStruct("Comments", "Youtube-dlc Command Line Interface."),
StringStruct("CompanyName", "theidel@uni-bremen.de"),
StringStruct("FileDescription", FILE_DESCRIPTION),
StringStruct("FileVersion", version),
StringStruct("InternalName", "youtube-dlc"),
StringStruct(
"LegalCopyright",
"theidel@uni-bremen.de | UNLICENSE",
),
StringStruct("OriginalFilename", "youtube-dlc.exe"),
StringStruct("ProductName", "Youtube-dlc"),
StringStruct("ProductVersion", version + " | git.io/JLh7K"),
],
)
]
),
VarFileInfo([VarStruct("Translation", [0, 1200])])
StringFileInfo([
StringTable(
'040904B0', [
StringStruct('Comments', 'Youtube-dlc%s Command Line Interface.' % _x86),
StringStruct('CompanyName', 'https://github.com/pukkandan/yt-dlp'),
StringStruct('FileDescription', FILE_DESCRIPTION),
StringStruct('FileVersion', VERSION),
StringStruct('InternalName', 'youtube-dlc%s' % _x86),
StringStruct(
'LegalCopyright',
'pukkandan@gmail.com | UNLICENSE',
),
StringStruct('OriginalFilename', 'youtube-dlc%s.exe' % _x86),
StringStruct('ProductName', 'Youtube-dlc%s' % _x86),
StringStruct('ProductVersion', '%s%s' % (VERSION, _x86)),
])]),
VarFileInfo([VarStruct('Translation', [0, 1200])])
]
)
PyInstaller.__main__.run([
'--name=youtube-dlc',
'--name=youtube-dlc%s' % _x86,
'--onefile',
'--icon=win/icon/cloud.ico',
'--icon=devscripts/cloud.ico',
'--exclude-module=youtube_dl',
'--exclude-module=test',
'--exclude-module=ytdlp_plugins',
'--hidden-import=mutagen',
'--hidden-import=Crypto',
'youtube_dlc/__main__.py',
])
SetVersion('dist/youtube-dlc.exe', version_file)
SetVersion('dist/youtube-dlc%s.exe' % _x86, VERSION_FILE)

View File

@@ -1,92 +0,0 @@
from __future__ import unicode_literals
from PyInstaller.utils.win32.versioninfo import (
VarStruct, VarFileInfo, StringStruct, StringTable,
StringFileInfo, FixedFileInfo, VSVersionInfo, SetVersion,
)
import PyInstaller.__main__
from datetime import datetime
FILE_DESCRIPTION = 'Media Downloader 32 Bit Version'
exec(compile(open('youtube_dlc/version.py').read(), 'youtube_dlc/version.py', 'exec'))
_LATEST_VERSION = locals()['__version__']
_OLD_VERSION = _LATEST_VERSION.rsplit("-", 1)
if len(_OLD_VERSION) > 0:
old_ver = _OLD_VERSION[0]
old_rev = ''
if len(_OLD_VERSION) > 1:
old_rev = _OLD_VERSION[1]
now = datetime.now()
# ver = f'{datetime.today():%Y.%m.%d}'
ver = now.strftime("%Y.%m.%d")
rev = ''
if old_ver == ver:
if old_rev:
rev = int(old_rev) + 1
else:
rev = 1
_SEPARATOR = '-'
version = _SEPARATOR.join(filter(None, [ver, str(rev)]))
print(version)
version_list = ver.split(".")
_year, _month, _day = [int(value) for value in version_list]
_rev = 0
if rev:
_rev = rev
_ver_tuple = _year, _month, _day, _rev
version_file = VSVersionInfo(
ffi=FixedFileInfo(
filevers=_ver_tuple,
prodvers=_ver_tuple,
mask=0x3F,
flags=0x0,
OS=0x4,
fileType=0x1,
subtype=0x0,
date=(0, 0),
),
kids=[
StringFileInfo(
[
StringTable(
"040904B0",
[
StringStruct("Comments", "Youtube-dlc_x86 Command Line Interface."),
StringStruct("CompanyName", "theidel@uni-bremen.de"),
StringStruct("FileDescription", FILE_DESCRIPTION),
StringStruct("FileVersion", version),
StringStruct("InternalName", "youtube-dlc_x86"),
StringStruct(
"LegalCopyright",
"theidel@uni-bremen.de | UNLICENSE",
),
StringStruct("OriginalFilename", "youtube-dlc_x86.exe"),
StringStruct("ProductName", "Youtube-dlc_x86"),
StringStruct("ProductVersion", version + "_x86 | git.io/JUGsM"),
],
)
]
),
VarFileInfo([VarStruct("Translation", [0, 1200])])
]
)
PyInstaller.__main__.run([
'--name=youtube-dlc_x86',
'--onefile',
'--icon=win/icon/cloud.ico',
'youtube_dlc/__main__.py',
])
SetVersion('dist/youtube-dlc_x86.exe', version_file)

2
requirements.txt Normal file
View File

@@ -0,0 +1,2 @@
mutagen
Crypto

View File

@@ -1,44 +0,0 @@
from __future__ import unicode_literals
from datetime import datetime
# import urllib.request
# response = urllib.request.urlopen('https://blackjack4494.github.io/youtube-dlc/update/LATEST_VERSION')
# _LATEST_VERSION = response.read().decode('utf-8')
exec(compile(open('youtube_dlc/version.py').read(), 'youtube_dlc/version.py', 'exec'))
_LATEST_VERSION = locals()['__version__']
_OLD_VERSION = _LATEST_VERSION.rsplit("-", 1)
if len(_OLD_VERSION) > 0:
old_ver = _OLD_VERSION[0]
old_rev = ''
if len(_OLD_VERSION) > 1:
old_rev = _OLD_VERSION[1]
now = datetime.now()
# ver = f'{datetime.today():%Y.%m.%d}'
ver = now.strftime("%Y.%m.%d")
rev = ''
if old_ver == ver:
if old_rev:
rev = int(old_rev) + 1
else:
rev = 1
_SEPARATOR = '-'
version = _SEPARATOR.join(filter(None, [ver, str(rev)]))
print('::set-output name=ytdlc_version::' + version)
file_version_py = open('youtube_dlc/version.py', 'rt')
data = file_version_py.read()
data = data.replace(locals()['__version__'], version)
file_version_py.close()
file_version_py = open('youtube_dlc/version.py', 'wt')
file_version_py.write(data)
file_version_py.close()

View File

@@ -1,33 +0,0 @@
# Unused
from __future__ import unicode_literals
from datetime import datetime
import urllib.request
response = urllib.request.urlopen('https://blackjack4494.github.io/youtube-dlc/update/LATEST_VERSION')
_LATEST_VERSION = response.read().decode('utf-8')
_OLD_VERSION = _LATEST_VERSION.rsplit("-", 1)
if len(_OLD_VERSION) > 0:
old_ver = _OLD_VERSION[0]
old_rev = ''
if len(_OLD_VERSION) > 1:
old_rev = _OLD_VERSION[1]
now = datetime.now()
# ver = f'{datetime.today():%Y.%m.%d}'
ver = now.strftime("%Y.%m.%d")
rev = ''
if old_ver == ver:
if old_rev:
rev = int(old_rev) + 1
else:
rev = 1
_SEPARATOR = '-'
version = _SEPARATOR.join(filter(None, [ver, str(rev)]))

View File

@@ -2,5 +2,5 @@
universal = True
[flake8]
exclude = youtube_dlc/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git,venv,devscripts/create-github-release.py,devscripts/release.sh,devscripts/show-downloads-statistics.py,scripts/update-version.py
exclude = youtube_dlc/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git,venv,devscripts/create-github-release.py,devscripts/release.sh,devscripts/show-downloads-statistics.py
ignore = E402,E501,E731,E741,W503

View File

@@ -7,10 +7,12 @@ import warnings
import sys
from distutils.spawn import spawn
# Get the version from youtube_dlc/version.py without importing the package
exec(compile(open('youtube_dlc/version.py').read(),
'youtube_dlc/version.py', 'exec'))
DESCRIPTION = 'Command-line program to download videos from YouTube.com and many other other video platforms.'
LONG_DESCRIPTION = '\n\n'.join((
@@ -18,6 +20,9 @@ LONG_DESCRIPTION = '\n\n'.join((
'**PS**: Many links in this document will not work since this is a copy of the README.md from Github',
open("README.md", "r", encoding="utf-8").read()))
REQUIREMENTS = ['mutagen', 'Crypto']
if len(sys.argv) >= 2 and sys.argv[1] == 'py2exe':
print("inv")
else:
@@ -41,10 +46,8 @@ else:
params = {
'data_files': data_files,
}
#if setuptools_available:
params['entry_points'] = {'console_scripts': ['youtube-dlc = youtube_dlc:main']}
#else:
# params['scripts'] = ['bin/youtube-dlc']
class build_lazy_extractors(Command):
description = 'Build the extractor lazy loading module'
@@ -62,6 +65,9 @@ class build_lazy_extractors(Command):
dry_run=self.dry_run,
)
packages = find_packages(exclude=("youtube_dl", "test", "ytdlp_plugins"))
setup(
name="yt-dlp",
version=__version__,
@@ -71,7 +77,8 @@ setup(
long_description=LONG_DESCRIPTION,
long_description_content_type="text/markdown",
url="https://github.com/pukkandan/yt-dlp",
packages=find_packages(exclude=("youtube_dl","test",)),
packages=packages,
install_requires=REQUIREMENTS,
project_urls={
'Documentation': 'https://github.com/pukkandan/yt-dlp#yt-dlp',
'Source': 'https://github.com/pukkandan/yt-dlp',

View File

@@ -8,10 +8,16 @@ import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dlc.postprocessor import MetadataFromTitlePP
from youtube_dlc.postprocessor import MetadataFromFieldPP, MetadataFromTitlePP
class TestMetadataFromField(unittest.TestCase):
def test_format_to_regex(self):
pp = MetadataFromFieldPP(None, ['title:%(title)s - %(artist)s'])
self.assertEqual(pp._data[0]['regex'], r'(?P<title>[^\r\n]+)\ \-\ (?P<artist>[^\r\n]+)')
class TestMetadataFromTitle(unittest.TestCase):
def test_format_to_regex(self):
pp = MetadataFromTitlePP(None, '%(title)s - %(artist)s')
self.assertEqual(pp._titleregex, r'(?P<title>.+)\ \-\ (?P<artist>.+)')
self.assertEqual(pp._titleregex, r'(?P<title>[^\r\n]+)\ \-\ (?P<artist>[^\r\n]+)')

View File

@@ -15,8 +15,6 @@ IGNORED_FILES = [
'setup.py', # http://bugs.python.org/issue13943
'conf.py',
'buildserver.py',
'pyinst.py',
'pyinst32.py',
]
IGNORED_DIRS = [

View File

@@ -1,275 +0,0 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
# Allow direct execution
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import expect_value
from youtube_dlc.extractor import YoutubeIE
class TestYoutubeChapters(unittest.TestCase):
_TEST_CASES = [
(
# https://www.youtube.com/watch?v=A22oy8dFjqc
# pattern: 00:00 - <title>
'''This is the absolute ULTIMATE experience of Queen's set at LIVE AID, this is the best video mixed to the absolutely superior stereo radio broadcast. This vastly superior audio mix takes a huge dump on all of the official mixes. Best viewed in 1080p. ENJOY! ***MAKE SURE TO READ THE DESCRIPTION***<br /><a href="#" onclick="yt.www.watch.player.seekTo(00*60+36);return false;">00:36</a> - Bohemian Rhapsody<br /><a href="#" onclick="yt.www.watch.player.seekTo(02*60+42);return false;">02:42</a> - Radio Ga Ga<br /><a href="#" onclick="yt.www.watch.player.seekTo(06*60+53);return false;">06:53</a> - Ay Oh!<br /><a href="#" onclick="yt.www.watch.player.seekTo(07*60+34);return false;">07:34</a> - Hammer To Fall<br /><a href="#" onclick="yt.www.watch.player.seekTo(12*60+08);return false;">12:08</a> - Crazy Little Thing Called Love<br /><a href="#" onclick="yt.www.watch.player.seekTo(16*60+03);return false;">16:03</a> - We Will Rock You<br /><a href="#" onclick="yt.www.watch.player.seekTo(17*60+18);return false;">17:18</a> - We Are The Champions<br /><a href="#" onclick="yt.www.watch.player.seekTo(21*60+12);return false;">21:12</a> - Is This The World We Created...?<br /><br />Short song analysis:<br /><br />- "Bohemian Rhapsody": Although it's a short medley version, it's one of the best performances of the ballad section, with Freddie nailing the Bb4s with the correct studio phrasing (for the first time ever!).<br /><br />- "Radio Ga Ga": Although it's missing one chorus, this is one of - if not the best - the best versions ever, Freddie nails all the Bb4s and sounds very clean! Spike Edney's Roland Jupiter 8 also really shines through on this mix, compared to the DVD releases!<br /><br />- "Audience Improv": A great improv, Freddie sounds strong and confident. You gotta love when he sustains that A4 for 4 seconds!<br /><br />- "Hammer To Fall": Despite missing a verse and a chorus, it's a strong version (possibly the best ever). Freddie sings the song amazingly, and even ad-libs a C#5 and a C5! Also notice how heavy Brian's guitar sounds compared to the thin DVD mixes - it roars!<br /><br />- "Crazy Little Thing Called Love": A great version, the crowd loves the song, the jam is great as well! Only downside to this is the slight feedback issues.<br /><br />- "We Will Rock You": Although cut down to the 1st verse and chorus, Freddie sounds strong. He nails the A4, and the solo from Dr. May is brilliant!<br /><br />- "We Are the Champions": Perhaps the high-light of the performance - Freddie is very daring on this version, he sustains the pre-chorus Bb4s, nails the 1st C5, belts great A4s, but most importantly: He nails the chorus Bb4s, in all 3 choruses! This is the only time he has ever done so! It has to be said though, the last one sounds a bit rough, but that's a side effect of belting high notes for the past 18 minutes, with nodules AND laryngitis!<br /><br />- "Is This The World We Created... ?": Freddie and Brian perform a beautiful version of this, and it is one of the best versions ever. It's both sad and hilarious that a couple of BBC engineers are talking over the song, one of them being completely oblivious of the fact that he is interrupting the performance, on live television... Which was being televised to almost 2 billion homes.<br /><br /><br />All rights go to their respective owners!<br />-----Copyright Disclaimer Under Section 107 of the Copyright Act 1976, allowance is made for fair use for purposes such as criticism, comment, news reporting, teaching, scholarship, and research. Fair use is a use permitted by copyright statute that might otherwise be infringing. Non-profit, educational or personal use tips the balance in favor of fair use''',
1477,
[{
'start_time': 36,
'end_time': 162,
'title': 'Bohemian Rhapsody',
}, {
'start_time': 162,
'end_time': 413,
'title': 'Radio Ga Ga',
}, {
'start_time': 413,
'end_time': 454,
'title': 'Ay Oh!',
}, {
'start_time': 454,
'end_time': 728,
'title': 'Hammer To Fall',
}, {
'start_time': 728,
'end_time': 963,
'title': 'Crazy Little Thing Called Love',
}, {
'start_time': 963,
'end_time': 1038,
'title': 'We Will Rock You',
}, {
'start_time': 1038,
'end_time': 1272,
'title': 'We Are The Champions',
}, {
'start_time': 1272,
'end_time': 1477,
'title': 'Is This The World We Created...?',
}]
),
(
# https://www.youtube.com/watch?v=ekYlRhALiRQ
# pattern: <num>. <title> 0:00
'1. Those Beaten Paths of Confusion <a href="#" onclick="yt.www.watch.player.seekTo(0*60+00);return false;">0:00</a><br />2. Beyond the Shadows of Emptiness & Nothingness <a href="#" onclick="yt.www.watch.player.seekTo(11*60+47);return false;">11:47</a><br />3. Poison Yourself...With Thought <a href="#" onclick="yt.www.watch.player.seekTo(26*60+30);return false;">26:30</a><br />4. The Agents of Transformation <a href="#" onclick="yt.www.watch.player.seekTo(35*60+57);return false;">35:57</a><br />5. Drowning in the Pain of Consciousness <a href="#" onclick="yt.www.watch.player.seekTo(44*60+32);return false;">44:32</a><br />6. Deny the Disease of Life <a href="#" onclick="yt.www.watch.player.seekTo(53*60+07);return false;">53:07</a><br /><br />More info/Buy: http://crepusculonegro.storenvy.com/products/257645-cn-03-arizmenda-within-the-vacuum-of-infinity<br /><br />No copyright is intended. The rights to this video are assumed by the owner and its affiliates.',
4009,
[{
'start_time': 0,
'end_time': 707,
'title': '1. Those Beaten Paths of Confusion',
}, {
'start_time': 707,
'end_time': 1590,
'title': '2. Beyond the Shadows of Emptiness & Nothingness',
}, {
'start_time': 1590,
'end_time': 2157,
'title': '3. Poison Yourself...With Thought',
}, {
'start_time': 2157,
'end_time': 2672,
'title': '4. The Agents of Transformation',
}, {
'start_time': 2672,
'end_time': 3187,
'title': '5. Drowning in the Pain of Consciousness',
}, {
'start_time': 3187,
'end_time': 4009,
'title': '6. Deny the Disease of Life',
}]
),
(
# https://www.youtube.com/watch?v=WjL4pSzog9w
# pattern: 00:00 <title>
'<a href="https://arizmenda.bandcamp.com/merch/despairs-depths-descended-cd" class="yt-uix-servicelink " data-target-new-window="True" data-servicelink="CDAQ6TgYACITCNf1raqT2dMCFdRjGAod_o0CBSj4HQ" data-url="https://arizmenda.bandcamp.com/merch/despairs-depths-descended-cd" rel="nofollow noopener" target="_blank">https://arizmenda.bandcamp.com/merch/...</a><br /><br /><a href="#" onclick="yt.www.watch.player.seekTo(00*60+00);return false;">00:00</a> Christening Unborn Deformities <br /><a href="#" onclick="yt.www.watch.player.seekTo(07*60+08);return false;">07:08</a> Taste of Purity<br /><a href="#" onclick="yt.www.watch.player.seekTo(16*60+16);return false;">16:16</a> Sculpting Sins of a Universal Tongue<br /><a href="#" onclick="yt.www.watch.player.seekTo(24*60+45);return false;">24:45</a> Birth<br /><a href="#" onclick="yt.www.watch.player.seekTo(31*60+24);return false;">31:24</a> Neves<br /><a href="#" onclick="yt.www.watch.player.seekTo(37*60+55);return false;">37:55</a> Libations in Limbo',
2705,
[{
'start_time': 0,
'end_time': 428,
'title': 'Christening Unborn Deformities',
}, {
'start_time': 428,
'end_time': 976,
'title': 'Taste of Purity',
}, {
'start_time': 976,
'end_time': 1485,
'title': 'Sculpting Sins of a Universal Tongue',
}, {
'start_time': 1485,
'end_time': 1884,
'title': 'Birth',
}, {
'start_time': 1884,
'end_time': 2275,
'title': 'Neves',
}, {
'start_time': 2275,
'end_time': 2705,
'title': 'Libations in Limbo',
}]
),
(
# https://www.youtube.com/watch?v=o3r1sn-t3is
# pattern: <title> 00:00 <note>
'Download this show in MP3: <a href="http://sh.st/njZKK" class="yt-uix-servicelink " data-url="http://sh.st/njZKK" data-target-new-window="True" data-servicelink="CDAQ6TgYACITCK3j8_6o2dMCFVDCGAoduVAKKij4HQ" rel="nofollow noopener" target="_blank">http://sh.st/njZKK</a><br /><br />Setlist:<br />I-E-A-I-A-I-O <a href="#" onclick="yt.www.watch.player.seekTo(00*60+45);return false;">00:45</a><br />Suite-Pee <a href="#" onclick="yt.www.watch.player.seekTo(4*60+26);return false;">4:26</a> (Incomplete)<br />Attack <a href="#" onclick="yt.www.watch.player.seekTo(5*60+31);return false;">5:31</a> (First live performance since 2011)<br />Prison Song <a href="#" onclick="yt.www.watch.player.seekTo(8*60+42);return false;">8:42</a><br />Know <a href="#" onclick="yt.www.watch.player.seekTo(12*60+32);return false;">12:32</a> (First live performance since 2011)<br />Aerials <a href="#" onclick="yt.www.watch.player.seekTo(15*60+32);return false;">15:32</a><br />Soldier Side - Intro <a href="#" onclick="yt.www.watch.player.seekTo(19*60+13);return false;">19:13</a><br />B.Y.O.B. <a href="#" onclick="yt.www.watch.player.seekTo(20*60+09);return false;">20:09</a><br />Soil <a href="#" onclick="yt.www.watch.player.seekTo(24*60+32);return false;">24:32</a><br />Darts <a href="#" onclick="yt.www.watch.player.seekTo(27*60+48);return false;">27:48</a><br />Radio/Video <a href="#" onclick="yt.www.watch.player.seekTo(30*60+38);return false;">30:38</a><br />Hypnotize <a href="#" onclick="yt.www.watch.player.seekTo(35*60+05);return false;">35:05</a><br />Temper <a href="#" onclick="yt.www.watch.player.seekTo(38*60+08);return false;">38:08</a> (First live performance since 1999)<br />CUBErt <a href="#" onclick="yt.www.watch.player.seekTo(41*60+00);return false;">41:00</a><br />Needles <a href="#" onclick="yt.www.watch.player.seekTo(42*60+57);return false;">42:57</a><br />Deer Dance <a href="#" onclick="yt.www.watch.player.seekTo(46*60+27);return false;">46:27</a><br />Bounce <a href="#" onclick="yt.www.watch.player.seekTo(49*60+38);return false;">49:38</a><br />Suggestions <a href="#" onclick="yt.www.watch.player.seekTo(51*60+25);return false;">51:25</a><br />Psycho <a href="#" onclick="yt.www.watch.player.seekTo(53*60+52);return false;">53:52</a><br />Chop Suey! <a href="#" onclick="yt.www.watch.player.seekTo(58*60+13);return false;">58:13</a><br />Lonely Day <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+01*60+15);return false;">1:01:15</a><br />Question! <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+04*60+14);return false;">1:04:14</a><br />Lost in Hollywood <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+08*60+10);return false;">1:08:10</a><br />Vicinity of Obscenity <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+13*60+40);return false;">1:13:40</a>(First live performance since 2012)<br />Forest <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+16*60+17);return false;">1:16:17</a><br />Cigaro <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+20*60+02);return false;">1:20:02</a><br />Toxicity <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+23*60+57);return false;">1:23:57</a>(with Chino Moreno)<br />Sugar <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+27*60+53);return false;">1:27:53</a>',
5640,
[{
'start_time': 45,
'end_time': 266,
'title': 'I-E-A-I-A-I-O',
}, {
'start_time': 266,
'end_time': 331,
'title': 'Suite-Pee (Incomplete)',
}, {
'start_time': 331,
'end_time': 522,
'title': 'Attack (First live performance since 2011)',
}, {
'start_time': 522,
'end_time': 752,
'title': 'Prison Song',
}, {
'start_time': 752,
'end_time': 932,
'title': 'Know (First live performance since 2011)',
}, {
'start_time': 932,
'end_time': 1153,
'title': 'Aerials',
}, {
'start_time': 1153,
'end_time': 1209,
'title': 'Soldier Side - Intro',
}, {
'start_time': 1209,
'end_time': 1472,
'title': 'B.Y.O.B.',
}, {
'start_time': 1472,
'end_time': 1668,
'title': 'Soil',
}, {
'start_time': 1668,
'end_time': 1838,
'title': 'Darts',
}, {
'start_time': 1838,
'end_time': 2105,
'title': 'Radio/Video',
}, {
'start_time': 2105,
'end_time': 2288,
'title': 'Hypnotize',
}, {
'start_time': 2288,
'end_time': 2460,
'title': 'Temper (First live performance since 1999)',
}, {
'start_time': 2460,
'end_time': 2577,
'title': 'CUBErt',
}, {
'start_time': 2577,
'end_time': 2787,
'title': 'Needles',
}, {
'start_time': 2787,
'end_time': 2978,
'title': 'Deer Dance',
}, {
'start_time': 2978,
'end_time': 3085,
'title': 'Bounce',
}, {
'start_time': 3085,
'end_time': 3232,
'title': 'Suggestions',
}, {
'start_time': 3232,
'end_time': 3493,
'title': 'Psycho',
}, {
'start_time': 3493,
'end_time': 3675,
'title': 'Chop Suey!',
}, {
'start_time': 3675,
'end_time': 3854,
'title': 'Lonely Day',
}, {
'start_time': 3854,
'end_time': 4090,
'title': 'Question!',
}, {
'start_time': 4090,
'end_time': 4420,
'title': 'Lost in Hollywood',
}, {
'start_time': 4420,
'end_time': 4577,
'title': 'Vicinity of Obscenity (First live performance since 2012)',
}, {
'start_time': 4577,
'end_time': 4802,
'title': 'Forest',
}, {
'start_time': 4802,
'end_time': 5037,
'title': 'Cigaro',
}, {
'start_time': 5037,
'end_time': 5273,
'title': 'Toxicity (with Chino Moreno)',
}, {
'start_time': 5273,
'end_time': 5640,
'title': 'Sugar',
}]
),
(
# https://www.youtube.com/watch?v=PkYLQbsqCE8
# pattern: <num> - <title> [<latinized title>] 0:00:00
'''Затемно (Zatemno) is an Obscure Black Metal Band from Russia.<br /><br />"Во прах (Vo prakh)'' Into The Ashes", Debut mini-album released may 6, 2016, by Death Knell Productions<br />Released on 6 panel digipak CD, limited to 100 copies only<br />And digital format on Bandcamp<br /><br />Tracklist<br /><br />1 - Во прах [Vo prakh] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+00*60+00);return false;">0:00:00</a><br />2 - Искупление [Iskupleniye] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+08*60+10);return false;">0:08:10</a><br />3 - Из серпов луны...[Iz serpov luny] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+14*60+30);return false;">0:14:30</a><br /><br />Links:<br /><a href="https://deathknellprod.bandcamp.com/album/--2" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://deathknellprod.bandcamp.com/album/--2" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow noopener">https://deathknellprod.bandcamp.com/a...</a><br /><a href="https://www.facebook.com/DeathKnellProd/" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://www.facebook.com/DeathKnellProd/" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow noopener">https://www.facebook.com/DeathKnellProd/</a><br /><br /><br />I don't have any right about this artifact, my only intention is to spread the music of the band, all rights are reserved to the Затемно (Zatemno) and his producers, Death Knell Productions.<br /><br />------------------------------------------------------------------<br /><br />Subscribe for more videos like this.<br />My link: <a href="https://web.facebook.com/AttackOfTheDragons" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://web.facebook.com/AttackOfTheDragons" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow noopener">https://web.facebook.com/AttackOfTheD...</a>''',
1138,
[{
'start_time': 0,
'end_time': 490,
'title': '1 - Во прах [Vo prakh]',
}, {
'start_time': 490,
'end_time': 870,
'title': '2 - Искупление [Iskupleniye]',
}, {
'start_time': 870,
'end_time': 1138,
'title': '3 - Из серпов луны...[Iz serpov luny]',
}]
),
(
# https://www.youtube.com/watch?v=xZW70zEasOk
# time point more than duration
'''● LCS Spring finals: Saturday and Sunday from <a href="#" onclick="yt.www.watch.player.seekTo(13*60+30);return false;">13:30</a> outside the venue! <br />● PAX East: Fri, Sat & Sun - more info in tomorrows video on the main channel!''',
283,
[]
),
]
def test_youtube_chapters(self):
for description, duration, expected_chapters in self._TEST_CASES:
ie = YoutubeIE()
expect_value(
self, ie._extract_chapters_from_description(description, duration),
expected_chapters, None)
if __name__ == '__main__':
unittest.main()

View File

@@ -86,13 +86,9 @@ class TestPlayerInfo(unittest.TestCase):
('https://www.youtube.com/yts/jsbin/player-en_US-vflaxXRn1/base.js', 'vflaxXRn1'),
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflXGBaUN.js', 'vflXGBaUN'),
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js', 'vflKjOTVq'),
('http://s.ytimg.com/yt/swfbin/watch_as3-vflrEm9Nq.swf', 'vflrEm9Nq'),
('https://s.ytimg.com/yts/swfbin/player-vflenCdZL/watch_as3.swf', 'vflenCdZL'),
)
for player_url, expected_player_id in PLAYER_URLS:
expected_player_type = player_url.split('.')[-1]
player_type, player_id = YoutubeIE._extract_player_info(player_url)
self.assertEqual(player_type, expected_player_type)
player_id = YoutubeIE._extract_player_info(player_url)
self.assertEqual(player_id, expected_player_id)

View File

@@ -1,45 +0,0 @@
# UTF-8
#
# For more details about fixed file info 'ffi' see:
# http://msdn.microsoft.com/en-us/library/ms646997.aspx
VSVersionInfo(
ffi=FixedFileInfo(
# filevers and prodvers should be always a tuple with four items: (1, 2, 3, 4)
# Set not needed items to zero 0.
filevers=(16, 9, 2020, 0),
prodvers=(16, 9, 2020, 0),
# Contains a bitmask that specifies the valid bits 'flags'r
mask=0x3f,
# Contains a bitmask that specifies the Boolean attributes of the file.
flags=0x0,
# The operating system for which this file was designed.
# 0x4 - NT and there is no need to change it.
# OS=0x40004,
OS=0x4,
# The general type of file.
# 0x1 - the file is an application.
fileType=0x1,
# The function of the file.
# 0x0 - the function is not defined for this fileType
subtype=0x0,
# Creation date and time stamp.
date=(0, 0)
),
kids=[
StringFileInfo(
[
StringTable(
u'040904B0',
[StringStruct(u'Comments', u'Youtube-dlc Command Line Interface.'),
StringStruct(u'CompanyName', u'theidel@uni-bremen.de'),
StringStruct(u'FileDescription', u'Media Downloader'),
StringStruct(u'FileVersion', u'16.9.2020.0'),
StringStruct(u'InternalName', u'youtube-dlc'),
StringStruct(u'LegalCopyright', u'theidel@uni-bremen.de | UNLICENSE'),
StringStruct(u'OriginalFilename', u'youtube-dlc.exe'),
StringStruct(u'ProductName', u'Youtube-dlc'),
StringStruct(u'ProductVersion', u'16.9.2020.0 | git.io/JUGsM')])
]),
VarFileInfo([VarStruct(u'Translation', [0, 1200])])
]
)

View File

@@ -49,6 +49,7 @@ from .utils import (
date_from_str,
DateRange,
DEFAULT_OUTTMPL,
OUTTMPL_TYPES,
determine_ext,
determine_protocol,
DOT_DESKTOP_LINK_TEMPLATE,
@@ -61,6 +62,7 @@ from .utils import (
ExistingVideoReached,
expand_path,
ExtractorError,
float_or_none,
format_bytes,
format_field,
formatSeconds,
@@ -91,6 +93,7 @@ from .utils import (
sanitized_Request,
std_headers,
str_or_none,
strftime_or_none,
subtitles_filename,
to_high_limit_path,
UnavailableVideoError,
@@ -172,15 +175,20 @@ class YoutubeDL(object):
forcejson: Force printing info_dict as JSON.
dump_single_json: Force printing the info_dict of the whole playlist
(or video) as a single JSON line.
force_write_download_archive: Force writing download archive regardless of
'skip_download' or 'simulate'.
force_write_download_archive: Force writing download archive regardless
of 'skip_download' or 'simulate'.
simulate: Do not download the video files.
format: Video format code. see "FORMAT SELECTION" for more details.
format_sort: How to sort the video formats. see "Sorting Formats" for more details.
format_sort_force: Force the given format_sort. see "Sorting Formats" for more details.
allow_multiple_video_streams: Allow multiple video streams to be merged into a single file
allow_multiple_audio_streams: Allow multiple audio streams to be merged into a single file
outtmpl: Template for output names.
format_sort: How to sort the video formats. see "Sorting Formats"
for more details.
format_sort_force: Force the given format_sort. see "Sorting Formats"
for more details.
allow_multiple_video_streams: Allow multiple video streams to be merged
into a single file
allow_multiple_audio_streams: Allow multiple audio streams to be merged
into a single file
outtmpl: Dictionary of templates for output names. Allowed keys
are 'default' and the keys of OUTTMPL_TYPES (in utils.py)
outtmpl_na_placeholder: Placeholder for unavailable meta fields.
restrictfilenames: Do not allow "&" and spaces in file names
trim_file_name: Limit length of filename (extension excluded)
@@ -202,8 +210,12 @@ class YoutubeDL(object):
logtostderr: Log messages to stderr instead of stdout.
writedescription: Write the video description to a .description file
writeinfojson: Write the video description to a .info.json file
writecomments: Extract video comments. This will not be written to disk
unless writeinfojson is also given
writeannotations: Write the video annotations to a .annotations.xml file
writethumbnail: Write the thumbnail image to a file
allow_playlist_files: Whether to write playlists' description, infojson etc
also to disk when using the 'write*' options
write_all_thumbnails: Write all thumbnail formats to files
writelink: Write an internet shortcut file, depending on the
current platform (.url/.webloc/.desktop)
@@ -294,6 +306,9 @@ class YoutubeDL(object):
Progress hooks are guaranteed to be called at least once
(with status "finished") if the download is successful.
merge_output_format: Extension to use when merging formats.
final_ext: Expected final extension; used to detect when the file was
already downloaded and converted. "merge_output_format" is
replaced by this extension when given
fixup: Automatically correct known faults of the file.
One of:
- "never": do nothing
@@ -347,7 +362,7 @@ class YoutubeDL(object):
The following options are used by the post processors:
prefer_ffmpeg: If False, use avconv instead of ffmpeg if both are available,
otherwise prefer ffmpeg.
otherwise prefer ffmpeg. (avconv support is deprecated)
ffmpeg_location: Location of the ffmpeg/avconv binary; either the path
to the binary or its containing directory.
postprocessor_args: A dictionary of postprocessor/executable keys (in lower case)
@@ -375,8 +390,7 @@ class YoutubeDL(object):
params = None
_ies = []
_pps = []
_pps_end = []
_pps = {'beforedl': [], 'aftermove': [], 'normal': []}
__prepare_filename_warned = False
_download_retcode = None
_num_downloads = None
@@ -390,8 +404,7 @@ class YoutubeDL(object):
params = {}
self._ies = []
self._ies_instances = {}
self._pps = []
self._pps_end = []
self._pps = {'beforedl': [], 'aftermove': [], 'normal': []}
self.__prepare_filename_warned = False
self._post_hooks = []
self._progress_hooks = []
@@ -438,6 +451,14 @@ class YoutubeDL(object):
if self.params.get('geo_verification_proxy') is None:
self.params['geo_verification_proxy'] = self.params['cn_verification_proxy']
if self.params.get('final_ext'):
if self.params.get('merge_output_format'):
self.report_warning('--merge-output-format will be ignored since --remux-video or --recode-video is given')
self.params['merge_output_format'] = self.params['final_ext']
if 'overwrites' in self.params and self.params['overwrites'] is None:
del self.params['overwrites']
check_deprecated('autonumber_size', '--autonumber-size', 'output template with %(autonumber)0Nd, where N in the number of digits')
check_deprecated('autonumber', '--auto-number', '-o "%(autonumber)s-%(title)s.%(ext)s"')
check_deprecated('usetitle', '--title', '-o "%(title)s-%(id)s.%(ext)s"')
@@ -479,10 +500,7 @@ class YoutubeDL(object):
'Set the LC_ALL environment variable to fix this.')
self.params['restrictfilenames'] = True
if isinstance(params.get('outtmpl'), bytes):
self.report_warning(
'Parameter outtmpl is bytes, but should be a unicode string. '
'Put from __future__ import unicode_literals at the top of your code file or consider switching to Python 3.x.')
self.outtmpl_dict = self.parse_outtmpl()
self._setup_opener()
@@ -494,11 +512,13 @@ class YoutubeDL(object):
pp_class = get_postprocessor(pp_def_raw['key'])
pp_def = dict(pp_def_raw)
del pp_def['key']
after_move = pp_def.get('_after_move', False)
if '_after_move' in pp_def:
del pp_def['_after_move']
if 'when' in pp_def:
when = pp_def['when']
del pp_def['when']
else:
when = 'normal'
pp = pp_class(self, **compat_kwargs(pp_def))
self.add_post_processor(pp, after_move=after_move)
self.add_post_processor(pp, when=when)
for ph in self.params.get('post_hooks', []):
self.add_post_hook(ph)
@@ -550,12 +570,9 @@ class YoutubeDL(object):
for ie in gen_extractor_classes():
self.add_info_extractor(ie)
def add_post_processor(self, pp, after_move=False):
def add_post_processor(self, pp, when='normal'):
"""Add a PostProcessor object to the end of the chain."""
if after_move:
self._pps_end.append(pp)
else:
self._pps.append(pp)
self._pps[when].append(pp)
pp.set_downloader(self)
def add_post_hook(self, ph):
@@ -715,15 +732,33 @@ class YoutubeDL(object):
def report_file_delete(self, file_name):
"""Report that existing file will be deleted."""
try:
self.to_screen('Deleting already existent file %s' % file_name)
self.to_screen('Deleting existing file %s' % file_name)
except UnicodeEncodeError:
self.to_screen('Deleting already existent file')
self.to_screen('Deleting existing file')
def prepare_filename(self, info_dict, warn=False):
"""Generate the output filename."""
def parse_outtmpl(self):
outtmpl_dict = self.params.get('outtmpl', {})
if not isinstance(outtmpl_dict, dict):
outtmpl_dict = {'default': outtmpl_dict}
outtmpl_dict.update({
k: v for k, v in DEFAULT_OUTTMPL.items()
if not outtmpl_dict.get(k)})
for key, val in outtmpl_dict.items():
if isinstance(val, bytes):
self.report_warning(
'Parameter outtmpl is bytes, but should be a unicode string. '
'Put from __future__ import unicode_literals at the top of your code file or consider switching to Python 3.x.')
return outtmpl_dict
def _prepare_filename(self, info_dict, tmpl_type='default'):
try:
template_dict = dict(info_dict)
template_dict['duration_string'] = ( # %(duration>%H-%M-%S)s is wrong if duration > 24hrs
formatSeconds(info_dict['duration'], '-')
if info_dict.get('duration', None) is not None
else None)
template_dict['epoch'] = int(time.time())
autonumber_size = self.params.get('autonumber_size')
if autonumber_size is None:
@@ -744,9 +779,11 @@ class YoutubeDL(object):
template_dict = dict((k, v if isinstance(v, compat_numeric_types) else sanitize(k, v))
for k, v in template_dict.items()
if v is not None and not isinstance(v, (list, tuple, dict)))
template_dict = collections.defaultdict(lambda: self.params.get('outtmpl_na_placeholder', 'NA'), template_dict)
na = self.params.get('outtmpl_na_placeholder', 'NA')
template_dict = collections.defaultdict(lambda: na, template_dict)
outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
outtmpl = self.outtmpl_dict.get(tmpl_type, self.outtmpl_dict['default'])
force_ext = OUTTMPL_TYPES.get(tmpl_type)
# For fields playlist_index and autonumber convert all occurrences
# of %(field)s to %(field)0Nd for backward compatibility
@@ -762,27 +799,45 @@ class YoutubeDL(object):
r'%%(\1)0%dd' % field_size_compat_map[mobj.group('field')],
outtmpl)
# As of [1] format syntax is:
# %[mapping_key][conversion_flags][minimum_width][.precision][length_modifier]type
# 1. https://docs.python.org/2/library/stdtypes.html#string-formatting
FORMAT_RE = r'''(?x)
(?<!%)
%
\({0}\) # mapping key
(?:[#0\-+ ]+)? # conversion flags (optional)
(?:\d+)? # minimum field width (optional)
(?:\.\d+)? # precision (optional)
[hlL]? # length modifier (optional)
(?P<type>[diouxXeEfFgGcrs%]) # conversion type
'''
numeric_fields = list(self._NUMERIC_FIELDS)
# Format date
FORMAT_DATE_RE = FORMAT_RE.format(r'(?P<key>(?P<field>\w+)>(?P<format>.+?))')
for mobj in re.finditer(FORMAT_DATE_RE, outtmpl):
conv_type, field, frmt, key = mobj.group('type', 'field', 'format', 'key')
if key in template_dict:
continue
value = strftime_or_none(template_dict.get(field), frmt, na)
if conv_type in 'crs': # string
value = sanitize(field, value)
else: # number
numeric_fields.append(key)
value = float_or_none(value, default=None)
if value is not None:
template_dict[key] = value
# Missing numeric fields used together with integer presentation types
# in format specification will break the argument substitution since
# string NA placeholder is returned for missing fields. We will patch
# output template for missing fields to meet string presentation type.
for numeric_field in self._NUMERIC_FIELDS:
for numeric_field in numeric_fields:
if numeric_field not in template_dict:
# As of [1] format syntax is:
# %[mapping_key][conversion_flags][minimum_width][.precision][length_modifier]type
# 1. https://docs.python.org/2/library/stdtypes.html#string-formatting
FORMAT_RE = r'''(?x)
(?<!%)
%
\({0}\) # mapping key
(?:[#0\-+ ]+)? # conversion flags (optional)
(?:\d+)? # minimum field width (optional)
(?:\.\d+)? # precision (optional)
[hlL]? # length modifier (optional)
[diouxXeEfFgGcrs%] # conversion type
'''
outtmpl = re.sub(
FORMAT_RE.format(numeric_field),
FORMAT_RE.format(re.escape(numeric_field)),
r'%({0})s'.format(numeric_field), outtmpl)
# expand_path translates '%%' into '%' and '$$' into '$'
@@ -798,6 +853,9 @@ class YoutubeDL(object):
# title "Hello $PATH", we don't want `$PATH` to be expanded.
filename = expand_path(outtmpl).replace(sep, '') % template_dict
if force_ext is not None:
filename = replace_extension(filename, force_ext, template_dict.get('ext'))
# https://github.com/blackjack4494/youtube-dlc/issues/85
trim_file_name = self.params.get('trim_file_name', False)
if trim_file_name:
@@ -815,25 +873,28 @@ class YoutubeDL(object):
filename = encodeFilename(filename, True).decode(preferredencoding())
filename = sanitize_path(filename)
if warn and not self.__prepare_filename_warned:
if not self.params.get('paths'):
pass
elif filename == '-':
self.report_warning('--paths is ignored when an outputting to stdout')
elif os.path.isabs(filename):
self.report_warning('--paths is ignored since an absolute path is given in output template')
self.__prepare_filename_warned = True
return filename
except ValueError as err:
self.report_error('Error in output template: ' + str(err) + ' (encoding: ' + repr(preferredencoding()) + ')')
return None
def prepare_filepath(self, filename, dir_type=''):
if filename == '-':
return filename
def prepare_filename(self, info_dict, dir_type='', warn=False):
"""Generate the output filename."""
paths = self.params.get('paths', {})
assert isinstance(paths, dict)
filename = self._prepare_filename(info_dict, dir_type or 'default')
if warn and not self.__prepare_filename_warned:
if not paths:
pass
elif filename == '-':
self.report_warning('--paths is ignored when an outputting to stdout')
elif os.path.isabs(filename):
self.report_warning('--paths is ignored since an absolute path is given in output template')
self.__prepare_filename_warned = True
if filename == '-' or not filename:
return filename
homepath = expand_path(paths.get('home', '').strip())
assert isinstance(homepath, compat_str)
subdir = expand_path(paths.get(dir_type, '').strip()) if dir_type else ''
@@ -933,9 +994,7 @@ class YoutubeDL(object):
self.to_screen("[%s] %s: has already been recorded in archive" % (
ie_key, temp_id))
break
return self.__extract_info(url, ie, download, extra_info, process, info_dict)
else:
self.report_error('no suitable InfoExtractor for URL %s' % url)
@@ -987,10 +1046,6 @@ class YoutubeDL(object):
self.add_extra_info(ie_result, {
'extractor': ie.IE_NAME,
'webpage_url': url,
'duration_string': (
formatSeconds(ie_result['duration'], '-')
if ie_result.get('duration', None) is not None
else None),
'webpage_url_basename': url_basename(url),
'extractor_key': ie.ie_key(),
})
@@ -1010,10 +1065,7 @@ class YoutubeDL(object):
extract_flat = self.params.get('extract_flat', False)
if ((extract_flat == 'in_playlist' and 'playlist' in extra_info)
or extract_flat is True):
self.__forced_printings(
ie_result,
self.prepare_filepath(self.prepare_filename(ie_result)),
incomplete=True)
self.__forced_printings(ie_result, self.prepare_filename(ie_result), incomplete=True)
return ie_result
if result_type == 'video':
@@ -1104,6 +1156,53 @@ class YoutubeDL(object):
playlist = ie_result.get('title') or ie_result.get('id')
self.to_screen('[download] Downloading playlist: %s' % playlist)
if self.params.get('allow_playlist_files', True):
ie_copy = {
'playlist': playlist,
'playlist_id': ie_result.get('id'),
'playlist_title': ie_result.get('title'),
'playlist_uploader': ie_result.get('uploader'),
'playlist_uploader_id': ie_result.get('uploader_id'),
'playlist_index': 0
}
ie_copy.update(dict(ie_result))
def ensure_dir_exists(path):
return make_dir(path, self.report_error)
if self.params.get('writeinfojson', False):
infofn = self.prepare_filename(ie_copy, 'pl_infojson')
if not ensure_dir_exists(encodeFilename(infofn)):
return
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(infofn)):
self.to_screen('[info] Playlist metadata is already present')
else:
playlist_info = dict(ie_result)
# playlist_info['entries'] = list(playlist_info['entries']) # Entries is a generator which shouldnot be resolved here
del playlist_info['entries']
self.to_screen('[info] Writing playlist metadata as JSON to: ' + infofn)
try:
write_json_file(self.filter_requested_info(playlist_info), infofn)
except (OSError, IOError):
self.report_error('Cannot write playlist metadata to JSON file ' + infofn)
if self.params.get('writedescription', False):
descfn = self.prepare_filename(ie_copy, 'pl_description')
if not ensure_dir_exists(encodeFilename(descfn)):
return
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(descfn)):
self.to_screen('[info] Playlist description is already present')
elif ie_result.get('description') is None:
self.report_warning('There\'s no playlist description to write.')
else:
try:
self.to_screen('[info] Writing playlist description to: ' + descfn)
with io.open(encodeFilename(descfn), 'w', encoding='utf-8') as descfile:
descfile.write(ie_result['description'])
except (OSError, IOError):
self.report_error('Cannot write playlist description file ' + descfn)
return
playlist_results = []
playliststart = self.params.get('playliststart', 1) - 1
@@ -1288,7 +1387,7 @@ class YoutubeDL(object):
and (
not can_merge()
or info_dict.get('is_live', False)
or self.params.get('outtmpl', DEFAULT_OUTTMPL) == '-'))
or self.outtmpl_dict['default'] == '-'))
return (
'best/bestvideo+bestaudio'
@@ -1799,7 +1898,7 @@ class YoutubeDL(object):
if req_format is None:
req_format = self._default_format_spec(info_dict, download=download)
if self.params.get('verbose'):
self._write_string('[debug] Default format spec: %s\n' % req_format)
self.to_screen('[debug] Default format spec: %s' % req_format)
format_selector = self.build_format_selector(req_format)
@@ -1948,10 +2047,12 @@ class YoutubeDL(object):
self._num_downloads += 1
filename = self.prepare_filename(info_dict, warn=True)
info_dict['_filename'] = full_filename = self.prepare_filepath(filename)
temp_filename = self.prepare_filepath(filename, 'temp')
info_dict = self.pre_process(info_dict)
info_dict['_filename'] = full_filename = self.prepare_filename(info_dict, warn=True)
temp_filename = self.prepare_filename(info_dict, 'temp')
files_to_move = {}
skip_dl = self.params.get('skip_download', False)
# Forced printings
self.__forced_printings(info_dict, full_filename, incomplete=False)
@@ -1963,7 +2064,7 @@ class YoutubeDL(object):
# Do nothing else if in simulate mode
return
if filename is None:
if full_filename is None:
return
def ensure_dir_exists(path):
@@ -1975,9 +2076,7 @@ class YoutubeDL(object):
return
if self.params.get('writedescription', False):
descfn = replace_extension(
self.prepare_filepath(filename, 'description'),
'description', info_dict.get('ext'))
descfn = self.prepare_filename(info_dict, 'description')
if not ensure_dir_exists(encodeFilename(descfn)):
return
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(descfn)):
@@ -1994,9 +2093,7 @@ class YoutubeDL(object):
return
if self.params.get('writeannotations', False):
annofn = replace_extension(
self.prepare_filepath(filename, 'annotation'),
'annotations.xml', info_dict.get('ext'))
annofn = self.prepare_filename(info_dict, 'annotation')
if not ensure_dir_exists(encodeFilename(annofn)):
return
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(annofn)):
@@ -2032,10 +2129,11 @@ class YoutubeDL(object):
# ie = self.get_info_extractor(info_dict['extractor_key'])
for sub_lang, sub_info in subtitles.items():
sub_format = sub_info['ext']
sub_filename = subtitles_filename(temp_filename, sub_lang, sub_format, info_dict.get('ext'))
sub_filename_final = subtitles_filename(
self.prepare_filepath(filename, 'subtitle'),
sub_fn = self.prepare_filename(info_dict, 'subtitle')
sub_filename = subtitles_filename(
temp_filename if not skip_dl else sub_fn,
sub_lang, sub_format, info_dict.get('ext'))
sub_filename_final = subtitles_filename(sub_fn, sub_lang, sub_format, info_dict.get('ext'))
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(sub_filename)):
self.to_screen('[info] Video subtitle %s.%s is already present' % (sub_lang, sub_format))
files_to_move[sub_filename] = sub_filename_final
@@ -2069,10 +2167,10 @@ class YoutubeDL(object):
(sub_lang, error_to_compat_str(err)))
continue
if self.params.get('skip_download', False):
if skip_dl:
if self.params.get('convertsubtitles', False):
# subconv = FFmpegSubtitlesConvertorPP(self, format=self.params.get('convertsubtitles'))
filename_real_ext = os.path.splitext(filename)[1][1:]
filename_real_ext = os.path.splitext(full_filename)[1][1:]
filename_wo_ext = (
os.path.splitext(full_filename)[0]
if filename_real_ext == info_dict['ext']
@@ -2087,29 +2185,31 @@ class YoutubeDL(object):
else:
try:
self.post_process(full_filename, info_dict, files_to_move)
except (PostProcessingError) as err:
self.report_error('postprocessing: %s' % str(err))
except PostProcessingError as err:
self.report_error('Postprocessing: %s' % str(err))
return
if self.params.get('writeinfojson', False):
infofn = replace_extension(
self.prepare_filepath(filename, 'infojson'),
'info.json', info_dict.get('ext'))
infofn = self.prepare_filename(info_dict, 'infojson')
if not ensure_dir_exists(encodeFilename(infofn)):
return
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(infofn)):
self.to_screen('[info] Video description metadata is already present')
self.to_screen('[info] Video metadata is already present')
else:
self.to_screen('[info] Writing video description metadata as JSON to: ' + infofn)
self.to_screen('[info] Writing video metadata as JSON to: ' + infofn)
try:
write_json_file(self.filter_requested_info(info_dict), infofn)
except (OSError, IOError):
self.report_error('Cannot write metadata to JSON file ' + infofn)
self.report_error('Cannot write video metadata to JSON file ' + infofn)
return
info_dict['__infojson_filename'] = infofn
thumbdir = os.path.dirname(self.prepare_filepath(filename, 'thumbnail'))
for thumbfn in self._write_thumbnails(info_dict, temp_filename):
files_to_move[thumbfn] = os.path.join(thumbdir, os.path.basename(thumbfn))
thumbfn = self.prepare_filename(info_dict, 'thumbnail')
thumb_fn_temp = temp_filename if not skip_dl else thumbfn
for thumb_ext in self._write_thumbnails(info_dict, thumb_fn_temp):
thumb_filename_temp = replace_extension(thumb_fn_temp, thumb_ext, info_dict.get('ext'))
thumb_filename = replace_extension(thumbfn, thumb_ext, info_dict.get('ext'))
files_to_move[thumb_filename_temp] = info_dict['__thumbnail_filename'] = thumb_filename
# Write internet shortcut files
url_link = webloc_link = desktop_link = False
@@ -2162,37 +2262,39 @@ class YoutubeDL(object):
# Download
must_record_download_archive = False
if not self.params.get('skip_download', False):
if not skip_dl:
try:
def existing_file(filename, temp_filename):
file_exists = os.path.exists(encodeFilename(filename))
tempfile_exists = (
False if temp_filename == filename
else os.path.exists(encodeFilename(temp_filename)))
if not self.params.get('overwrites', False) and (file_exists or tempfile_exists):
existing_filename = temp_filename if tempfile_exists else filename
self.to_screen('[download] %s has already been downloaded and merged' % existing_filename)
return existing_filename
if tempfile_exists:
self.report_file_delete(temp_filename)
os.remove(encodeFilename(temp_filename))
if file_exists:
self.report_file_delete(filename)
os.remove(encodeFilename(filename))
return None
def existing_file(*filepaths):
ext = info_dict.get('ext')
final_ext = self.params.get('final_ext', ext)
existing_files = []
for file in orderedSet(filepaths):
if final_ext != ext:
converted = replace_extension(file, final_ext, ext)
if os.path.exists(encodeFilename(converted)):
existing_files.append(converted)
if os.path.exists(encodeFilename(file)):
existing_files.append(file)
if not existing_files or self.params.get('overwrites', False):
for file in orderedSet(existing_files):
self.report_file_delete(file)
os.remove(encodeFilename(file))
return None
self.report_file_already_downloaded(existing_files[0])
info_dict['ext'] = os.path.splitext(existing_files[0])[1][1:]
return existing_files[0]
success = True
if info_dict.get('requested_formats') is not None:
downloaded = []
merger = FFmpegMergerPP(self)
if not merger.available:
postprocessors = []
self.report_warning('You have requested multiple '
'formats but ffmpeg or avconv are not installed.'
'formats but ffmpeg is not installed.'
' The formats won\'t be merged.')
else:
postprocessors = [merger]
def compatible_formats(formats):
# TODO: some formats actually allow this (mkv, webm, ogg, mp4), but not all of them.
@@ -2237,14 +2339,15 @@ class YoutubeDL(object):
new_info = dict(info_dict)
new_info.update(f)
fname = prepend_extension(
self.prepare_filepath(self.prepare_filename(new_info), 'temp'),
self.prepare_filename(new_info, 'temp'),
'f%s' % f['format_id'], new_info['ext'])
if not ensure_dir_exists(fname):
return
downloaded.append(fname)
partial_success, real_download = dl(fname, new_info)
success = success and partial_success
info_dict['__postprocessors'] = postprocessors
if merger.available:
info_dict['__postprocessors'].append(merger)
info_dict['__files_to_merge'] = downloaded
# Even if there were no downloads, it is being merged only now
info_dict['__real_download'] = True
@@ -2267,13 +2370,13 @@ class YoutubeDL(object):
self.report_error('content too short (expected %s bytes and served %s)' % (err.expected, err.downloaded))
return
if success and filename != '-':
if success and full_filename != '-':
# Fixup content
fixup_policy = self.params.get('fixup')
if fixup_policy is None:
fixup_policy = 'detect_or_warn'
INSTALL_FFMPEG_MESSAGE = 'Install ffmpeg or avconv to fix this automatically.'
INSTALL_FFMPEG_MESSAGE = 'Install ffmpeg to fix this automatically.'
stretched_ratio = info_dict.get('stretched_ratio')
if stretched_ratio is not None and stretched_ratio != 1:
@@ -2292,7 +2395,8 @@ class YoutubeDL(object):
assert fixup_policy in ('ignore', 'never')
if (info_dict.get('requested_formats') is None
and info_dict.get('container') == 'm4a_dash'):
and info_dict.get('container') == 'm4a_dash'
and info_dict.get('ext') == 'm4a'):
if fixup_policy == 'warn':
self.report_warning(
'%s: writing DASH m4a. '
@@ -2329,8 +2433,8 @@ class YoutubeDL(object):
try:
self.post_process(dl_filename, info_dict, files_to_move)
except (PostProcessingError) as err:
self.report_error('postprocessing: %s' % str(err))
except PostProcessingError as err:
self.report_error('Postprocessing: %s' % str(err))
return
try:
for ph in self._post_hooks:
@@ -2348,7 +2452,7 @@ class YoutubeDL(object):
def download(self, url_list):
"""Download a given list of URLs."""
outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
outtmpl = self.outtmpl_dict['default']
if (len(url_list) > 1
and outtmpl != '-'
and '%' not in outtmpl
@@ -2396,45 +2500,48 @@ class YoutubeDL(object):
@staticmethod
def filter_requested_info(info_dict):
fields_to_remove = ('requested_formats', 'requested_subtitles')
return dict(
(k, v) for k, v in info_dict.items()
if k not in ['requested_formats', 'requested_subtitles'])
if (k[0] != '_' or k == '_type') and k not in fields_to_remove)
def run_pp(self, pp, infodict, files_to_move={}):
files_to_delete = []
files_to_delete, infodict = pp.run(infodict)
if not files_to_delete:
return files_to_move, infodict
if self.params.get('keepvideo', False):
for f in files_to_delete:
files_to_move.setdefault(f, '')
else:
for old_filename in set(files_to_delete):
self.to_screen('Deleting original file %s (pass -k to keep)' % old_filename)
try:
os.remove(encodeFilename(old_filename))
except (IOError, OSError):
self.report_warning('Unable to remove downloaded original file')
if old_filename in files_to_move:
del files_to_move[old_filename]
return files_to_move, infodict
def pre_process(self, ie_info):
info = dict(ie_info)
for pp in self._pps['beforedl']:
info = self.run_pp(pp, info)[1]
return info
def post_process(self, filename, ie_info, files_to_move={}):
"""Run all the postprocessors on the given file."""
info = dict(ie_info)
info['filepath'] = filename
info['__files_to_move'] = {}
def run_pp(pp):
files_to_delete = []
infodict = info
try:
files_to_delete, infodict = pp.run(infodict)
except PostProcessingError as e:
self.report_error(e.msg)
if not files_to_delete:
return infodict
if self.params.get('keepvideo', False):
for f in files_to_delete:
files_to_move.setdefault(f, '')
else:
for old_filename in set(files_to_delete):
self.to_screen('Deleting original file %s (pass -k to keep)' % old_filename)
try:
os.remove(encodeFilename(old_filename))
except (IOError, OSError):
self.report_warning('Unable to remove downloaded original file')
if old_filename in files_to_move:
del files_to_move[old_filename]
return infodict
for pp in ie_info.get('__postprocessors', []) + self._pps:
info = run_pp(pp)
info = run_pp(MoveFilesAfterDownloadPP(self, files_to_move))
files_to_move = {}
for pp in self._pps_end:
info = run_pp(pp)
for pp in ie_info.get('__postprocessors', []) + self._pps['normal']:
files_to_move, info = self.run_pp(pp, info, files_to_move)
info = self.run_pp(MoveFilesAfterDownloadPP(self, files_to_move), info)[1]
for pp in self._pps['aftermove']:
info = self.run_pp(pp, info, {})[1]
def _make_archive_id(self, info_dict):
video_id = info_dict.get('id')
@@ -2785,25 +2892,22 @@ class YoutubeDL(object):
encoding = preferredencoding()
return encoding
def _write_thumbnails(self, info_dict, filename):
if self.params.get('writethumbnail', False):
thumbnails = info_dict.get('thumbnails')
if thumbnails:
thumbnails = [thumbnails[-1]]
elif self.params.get('write_all_thumbnails', False):
def _write_thumbnails(self, info_dict, filename): # return the extensions
write_all = self.params.get('write_all_thumbnails', False)
thumbnails = []
if write_all or self.params.get('writethumbnail', False):
thumbnails = info_dict.get('thumbnails') or []
else:
thumbnails = []
multiple = write_all and len(thumbnails) > 1
ret = []
for t in thumbnails:
for t in thumbnails[::1 if write_all else -1]:
thumb_ext = determine_ext(t['url'], 'jpg')
suffix = '_%s' % t['id'] if len(thumbnails) > 1 else ''
thumb_display_id = '%s ' % t['id'] if len(thumbnails) > 1 else ''
t['filename'] = thumb_filename = replace_extension(filename + suffix, thumb_ext, info_dict.get('ext'))
suffix = '%s.' % t['id'] if multiple else ''
thumb_display_id = '%s ' % t['id'] if multiple else ''
t['filename'] = thumb_filename = replace_extension(filename, suffix + thumb_ext, info_dict.get('ext'))
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(thumb_filename)):
ret.append(thumb_filename)
ret.append(suffix + thumb_ext)
self.to_screen('[%s] %s: Thumbnail %sis already present' %
(info_dict['extractor'], info_dict['id'], thumb_display_id))
else:
@@ -2813,10 +2917,12 @@ class YoutubeDL(object):
uf = self.urlopen(t['url'])
with open(encodeFilename(thumb_filename), 'wb') as thumbf:
shutil.copyfileobj(uf, thumbf)
ret.append(thumb_filename)
ret.append(suffix + thumb_ext)
self.to_screen('[%s] %s: Writing thumbnail %sto: %s' %
(info_dict['extractor'], info_dict['id'], thumb_display_id, thumb_filename))
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
self.report_warning('Unable to download thumbnail "%s": %s' %
(t['url'], error_to_compat_str(err)))
if ret and not write_all:
break
return ret

View File

@@ -23,7 +23,6 @@ from .compat import (
from .utils import (
DateRange,
decodeOption,
DEFAULT_OUTTMPL,
DownloadError,
ExistingVideoReached,
expand_path,
@@ -32,11 +31,12 @@ from .utils import (
preferredencoding,
read_batch_urls,
RejectedVideoReached,
REMUX_EXTENSIONS,
render_table,
SameFileError,
setproctitle,
std_headers,
write_string,
render_table,
)
from .update import update_self
from .downloader import (
@@ -45,6 +45,7 @@ from .downloader import (
from .extractor import gen_extractors, list_extractors
from .extractor.common import InfoExtractor
from .extractor.adobepass import MSO_INFO
from .postprocessor.metadatafromfield import MetadataFromFieldPP
from .YoutubeDL import YoutubeDL
@@ -208,12 +209,17 @@ def _real_main(argv=None):
opts.audioquality = opts.audioquality.strip('k').strip('K')
if not opts.audioquality.isdigit():
parser.error('invalid audio quality specified')
if opts.remuxvideo is not None:
if opts.remuxvideo not in ['mp4', 'mkv']:
parser.error('invalid video container format specified')
if opts.recodevideo is not None:
if opts.recodevideo not in ['mp4', 'flv', 'webm', 'ogg', 'mkv', 'avi']:
if opts.recodevideo not in REMUX_EXTENSIONS:
parser.error('invalid video recode format specified')
if opts.remuxvideo and opts.recodevideo:
opts.remuxvideo = None
write_string('WARNING: --remux-video is ignored since --recode-video was given\n', out=sys.stderr)
if opts.remuxvideo is not None:
opts.remuxvideo = opts.remuxvideo.replace(' ', '')
remux_regex = r'{0}(?:/{0})*$'.format(r'(?:\w+>)?(?:%s)' % '|'.join(REMUX_EXTENSIONS))
if not re.match(remux_regex, opts.remuxvideo):
parser.error('invalid video remux format specified')
if opts.convertsubtitles is not None:
if opts.convertsubtitles not in ['srt', 'vtt', 'ass', 'lrc']:
parser.error('invalid subtitle format specified')
@@ -227,38 +233,45 @@ def _real_main(argv=None):
if opts.extractaudio and not opts.keepvideo and opts.format is None:
opts.format = 'bestaudio/best'
# --all-sub automatically sets --write-sub if --write-auto-sub is not given
# this was the old behaviour if only --all-sub was given.
if opts.allsubtitles and not opts.writeautomaticsub:
opts.writesubtitles = True
outtmpl = ((opts.outtmpl is not None and opts.outtmpl)
or (opts.format == '-1' and opts.usetitle and '%(title)s-%(id)s-%(format)s.%(ext)s')
or (opts.format == '-1' and '%(id)s-%(format)s.%(ext)s')
or (opts.usetitle and opts.autonumber and '%(autonumber)s-%(title)s-%(id)s.%(ext)s')
or (opts.usetitle and '%(title)s-%(id)s.%(ext)s')
or (opts.useid and '%(id)s.%(ext)s')
or (opts.autonumber and '%(autonumber)s-%(id)s.%(ext)s')
or DEFAULT_OUTTMPL)
if not os.path.splitext(outtmpl)[1] and opts.extractaudio:
outtmpl = opts.outtmpl
if not outtmpl:
outtmpl = {'default': (
'%(title)s-%(id)s-%(format)s.%(ext)s' if opts.format == '-1' and opts.usetitle
else '%(id)s-%(format)s.%(ext)s' if opts.format == '-1'
else '%(autonumber)s-%(title)s-%(id)s.%(ext)s' if opts.usetitle and opts.autonumber
else '%(title)s-%(id)s.%(ext)s' if opts.usetitle
else '%(id)s.%(ext)s' if opts.useid
else '%(autonumber)s-%(id)s.%(ext)s' if opts.autonumber
else None)}
outtmpl_default = outtmpl.get('default')
if outtmpl_default is not None and not os.path.splitext(outtmpl_default)[1] and opts.extractaudio:
parser.error('Cannot download a video and extract audio into the same'
' file! Use "{0}.%(ext)s" instead of "{0}" as the output'
' template'.format(outtmpl))
' template'.format(outtmpl_default))
for f in opts.format_sort:
if re.match(InfoExtractor.FormatSort.regex, f) is None:
parser.error('invalid format sort string "%s" specified' % f)
if opts.metafromfield is None:
opts.metafromfield = []
if opts.metafromtitle is not None:
opts.metafromfield.append('title:%s' % opts.metafromtitle)
for f in opts.metafromfield:
if re.match(MetadataFromFieldPP.regex, f) is None:
parser.error('invalid format string "%s" specified for --parse-metadata' % f)
any_getting = opts.geturl or opts.gettitle or opts.getid or opts.getthumbnail or opts.getdescription or opts.getfilename or opts.getformat or opts.getduration or opts.dumpjson or opts.dump_single_json
any_printing = opts.print_json
download_archive_fn = expand_path(opts.download_archive) if opts.download_archive is not None else opts.download_archive
# PostProcessors
postprocessors = []
if opts.metafromtitle:
if opts.metafromfield:
postprocessors.append({
'key': 'MetadataFromTitle',
'titleformat': opts.metafromtitle
'key': 'MetadataFromField',
'formats': opts.metafromfield,
'when': 'beforedl'
})
if opts.extractaudio:
postprocessors.append({
@@ -293,9 +306,17 @@ def _real_main(argv=None):
'format': opts.convertsubtitles,
})
if opts.embedsubtitles:
already_have_subtitle = opts.writesubtitles
postprocessors.append({
'key': 'FFmpegEmbedSubtitle',
'already_have_subtitle': already_have_subtitle
})
if not already_have_subtitle:
opts.writesubtitles = True
# --all-sub automatically sets --write-sub if --write-auto-sub is not given
# this was the old behaviour if only --all-sub was given.
if opts.allsubtitles and not opts.writeautomaticsub:
opts.writesubtitles = True
if opts.embedthumbnail:
already_have_thumbnail = opts.writethumbnail or opts.write_all_thumbnails
postprocessors.append({
@@ -324,7 +345,7 @@ def _real_main(argv=None):
postprocessors.append({
'key': 'ExecAfterDownload',
'exec_cmd': opts.exec_cmd,
'_after_move': True
'when': 'aftermove'
})
_args_compat_warning = 'WARNING: %s given without specifying name. The arguments will be given to all %s\n'
@@ -336,6 +357,12 @@ def _real_main(argv=None):
opts.postprocessor_args.setdefault('sponskrub', [])
opts.postprocessor_args['default'] = opts.postprocessor_args['default-compat']
final_ext = (
opts.recodevideo
or (opts.remuxvideo in REMUX_EXTENSIONS) and opts.remuxvideo
or (opts.extractaudio and opts.audioformat != 'best') and opts.audioformat
or None)
match_filter = (
None if opts.match_filter is None
else match_filter_func(opts.match_filter))
@@ -397,13 +424,14 @@ def _real_main(argv=None):
'playlistreverse': opts.playlist_reverse,
'playlistrandom': opts.playlist_random,
'noplaylist': opts.noplaylist,
'logtostderr': opts.outtmpl == '-',
'logtostderr': outtmpl_default == '-',
'consoletitle': opts.consoletitle,
'nopart': opts.nopart,
'updatetime': opts.updatetime,
'writedescription': opts.writedescription,
'writeannotations': opts.writeannotations,
'writeinfojson': opts.writeinfojson,
'writeinfojson': opts.writeinfojson or opts.getcomments,
'getcomments': opts.getcomments,
'writethumbnail': opts.writethumbnail,
'write_all_thumbnails': opts.write_all_thumbnails,
'writelink': opts.writelink,
@@ -454,6 +482,7 @@ def _real_main(argv=None):
'extract_flat': opts.extract_flat,
'mark_watched': opts.mark_watched,
'merge_output_format': opts.merge_output_format,
'final_ext': final_ext,
'postprocessors': postprocessors,
'fixup': opts.fixup,
'source_address': opts.source_address,

View File

@@ -1,11 +1,24 @@
from __future__ import unicode_literals
from ..utils import (
determine_protocol,
)
def _get_real_downloader(info_dict, protocol=None, *args, **kwargs):
info_copy = info_dict.copy()
if protocol:
info_copy['protocol'] = protocol
return get_suitable_downloader(info_copy, *args, **kwargs)
# Some of these require _get_real_downloader
from .common import FileDownloader
from .dash import DashSegmentsFD
from .f4m import F4mFD
from .hls import HlsFD
from .http import HttpFD
from .rtmp import RtmpFD
from .dash import DashSegmentsFD
from .rtsp import RtspFD
from .ism import IsmFD
from .youtube_live_chat import YoutubeLiveChatReplayFD
@@ -14,10 +27,6 @@ from .external import (
FFmpegFD,
)
from ..utils import (
determine_protocol,
)
PROTOCOL_MAP = {
'rtmp': RtmpFD,
'm3u8_native': HlsFD,
@@ -31,7 +40,7 @@ PROTOCOL_MAP = {
}
def get_suitable_downloader(info_dict, params={}):
def get_suitable_downloader(info_dict, params={}, default=HttpFD):
"""Get the downloader class that can handle the info dict."""
protocol = determine_protocol(info_dict)
info_dict['protocol'] = protocol
@@ -45,16 +54,17 @@ def get_suitable_downloader(info_dict, params={}):
if ed.can_download(info_dict):
return ed
if protocol.startswith('m3u8') and info_dict.get('is_live'):
return FFmpegFD
if protocol.startswith('m3u8'):
if info_dict.get('is_live'):
return FFmpegFD
elif _get_real_downloader(info_dict, 'frag_urls', params, None):
return HlsFD
elif params.get('hls_prefer_native') is True:
return HlsFD
elif params.get('hls_prefer_native') is False:
return FFmpegFD
if protocol == 'm3u8' and params.get('hls_prefer_native') is True:
return HlsFD
if protocol == 'm3u8_native' and params.get('hls_prefer_native') is False:
return FFmpegFD
return PROTOCOL_MAP.get(protocol, HttpFD)
return PROTOCOL_MAP.get(protocol, default)
__all__ = [

View File

@@ -332,7 +332,7 @@ class FileDownloader(object):
"""
nooverwrites_and_exists = (
not self.params.get('overwrites', True)
not self.params.get('overwrites', subtitle)
and os.path.exists(encodeFilename(filename))
)

View File

@@ -1,6 +1,8 @@
from __future__ import unicode_literals
from ..downloader import _get_real_downloader
from .fragment import FragmentFD
from ..compat import compat_urllib_error
from ..utils import (
DownloadError,
@@ -20,31 +22,42 @@ class DashSegmentsFD(FragmentFD):
fragments = info_dict['fragments'][:1] if self.params.get(
'test', False) else info_dict['fragments']
real_downloader = _get_real_downloader(info_dict, 'frag_urls', self.params, None)
ctx = {
'filename': filename,
'total_frags': len(fragments),
}
self._prepare_and_start_frag_download(ctx)
if real_downloader:
self._prepare_external_frag_download(ctx)
else:
self._prepare_and_start_frag_download(ctx)
fragment_retries = self.params.get('fragment_retries', 0)
skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True)
fragment_urls = []
frag_index = 0
for i, fragment in enumerate(fragments):
frag_index += 1
if frag_index <= ctx['fragment_index']:
continue
fragment_url = fragment.get('url')
if not fragment_url:
assert fragment_base_url
fragment_url = urljoin(fragment_base_url, fragment['path'])
if real_downloader:
fragment_urls.append(fragment_url)
continue
# In DASH, the first segment contains necessary headers to
# generate a valid MP4 file, so always abort for the first segment
fatal = i == 0 or not skip_unavailable_fragments
count = 0
while count <= fragment_retries:
try:
fragment_url = fragment.get('url')
if not fragment_url:
assert fragment_base_url
fragment_url = urljoin(fragment_base_url, fragment['path'])
success, frag_content = self._download_fragment(ctx, fragment_url, info_dict)
if not success:
return False
@@ -75,6 +88,16 @@ class DashSegmentsFD(FragmentFD):
self.report_error('giving up after %s fragment retries' % fragment_retries)
return False
self._finish_frag_download(ctx)
if real_downloader:
info_copy = info_dict.copy()
info_copy['url_list'] = fragment_urls
fd = real_downloader(self.ydl, self.params)
# TODO: Make progress updates work without hooking twice
# for ph in self._progress_hooks:
# fd.add_progress_hook(ph)
success = fd.real_download(filename, info_copy)
if not success:
return False
else:
self._finish_frag_download(ctx)
return True

View File

@@ -5,6 +5,13 @@ import re
import subprocess
import sys
import time
import shutil
try:
from Crypto.Cipher import AES
can_decrypt_frag = True
except ImportError:
can_decrypt_frag = False
from .common import FileDownloader
from ..compat import (
@@ -18,15 +25,19 @@ from ..utils import (
cli_bool_option,
cli_configuration_args,
encodeFilename,
error_to_compat_str,
encodeArgument,
handle_youtubedl_headers,
check_executable,
is_outdated_version,
process_communicate_or_kill,
sanitized_Request,
)
class ExternalFD(FileDownloader):
SUPPORTED_PROTOCOLS = ('http', 'https', 'ftp', 'ftps')
def real_download(self, filename, info_dict):
self.report_destination(filename)
tmpfilename = self.temp_name(filename)
@@ -79,7 +90,7 @@ class ExternalFD(FileDownloader):
@classmethod
def supports(cls, info_dict):
return info_dict['protocol'] in ('http', 'https', 'ftp', 'ftps')
return info_dict['protocol'] in cls.SUPPORTED_PROTOCOLS
@classmethod
def can_download(cls, info_dict):
@@ -109,8 +120,47 @@ class ExternalFD(FileDownloader):
_, stderr = process_communicate_or_kill(p)
if p.returncode != 0:
self.to_stderr(stderr.decode('utf-8', 'replace'))
if 'url_list' in info_dict:
file_list = []
for [i, url] in enumerate(info_dict['url_list']):
tmpsegmentname = '%s_%s.frag' % (tmpfilename, i)
file_list.append(tmpsegmentname)
with open(tmpfilename, 'wb') as dest:
for i in file_list:
if 'decrypt_info' in info_dict:
decrypt_info = info_dict['decrypt_info']
with open(i, 'rb') as src:
if decrypt_info['METHOD'] == 'AES-128':
iv = decrypt_info.get('IV')
decrypt_info['KEY'] = decrypt_info.get('KEY') or self.ydl.urlopen(
self._prepare_url(info_dict, info_dict.get('_decryption_key_url') or decrypt_info['URI'])).read()
encrypted_data = src.read()
decrypted_data = AES.new(
decrypt_info['KEY'], AES.MODE_CBC, iv).decrypt(encrypted_data)
dest.write(decrypted_data)
else:
shutil.copyfileobj(open(i, 'rb'), dest)
else:
shutil.copyfileobj(open(i, 'rb'), dest)
if not self.params.get('keep_fragments', False):
for file_path in file_list:
try:
os.remove(file_path)
except OSError as ose:
self.report_error("Unable to delete file %s; %s" % (file_path, error_to_compat_str(ose)))
try:
file_path = '%s.frag.urls' % tmpfilename
os.remove(file_path)
except OSError as ose:
self.report_error("Unable to delete file %s; %s" % (file_path, error_to_compat_str(ose)))
return p.returncode
def _prepare_url(self, info_dict, url):
headers = info_dict.get('http_headers')
return sanitized_Request(url, None, headers) if headers else url
class CurlFD(ExternalFD):
AVAILABLE_OPT = '-V'
@@ -186,15 +236,17 @@ class WgetFD(ExternalFD):
class Aria2cFD(ExternalFD):
AVAILABLE_OPT = '-v'
SUPPORTED_PROTOCOLS = ('http', 'https', 'ftp', 'ftps', 'frag_urls')
def _make_cmd(self, tmpfilename, info_dict):
cmd = [self.exe, '-c']
cmd += self._configuration_args([
'--min-split-size', '1M', '--max-connection-per-server', '4'])
dn = os.path.dirname(tmpfilename)
if 'url_list' not in info_dict:
cmd += ['--out', os.path.basename(tmpfilename)]
verbose_level_args = ['--console-log-level=warn', '--summary-interval=0']
cmd += self._configuration_args(['--file-allocation=none', '-x16', '-j16', '-s16'] + verbose_level_args)
if dn:
cmd += ['--dir', dn]
cmd += ['--out', os.path.basename(tmpfilename)]
if info_dict.get('http_headers') is not None:
for key, val in info_dict['http_headers'].items():
cmd += ['--header', '%s: %s' % (key, val)]
@@ -202,7 +254,21 @@ class Aria2cFD(ExternalFD):
cmd += self._option('--all-proxy', 'proxy')
cmd += self._bool_option('--check-certificate', 'nocheckcertificate', 'false', 'true', '=')
cmd += self._bool_option('--remote-time', 'updatetime', 'true', 'false', '=')
cmd += ['--', info_dict['url']]
cmd += ['--auto-file-renaming=false']
if 'url_list' in info_dict:
cmd += verbose_level_args
cmd += ['--uri-selector', 'inorder', '--download-result=hide']
url_list_file = '%s.frag.urls' % tmpfilename
url_list = []
for [i, url] in enumerate(info_dict['url_list']):
tmpsegmentname = '%s_%s.frag' % (os.path.basename(tmpfilename), i)
url_list.append('%s\n\tout=%s' % (url, tmpsegmentname))
with open(url_list_file, 'w') as f:
f.write('\n'.join(url_list))
cmd += ['-i', url_list_file]
else:
cmd += ['--', info_dict['url']]
return cmd
@@ -221,9 +287,7 @@ class HttpieFD(ExternalFD):
class FFmpegFD(ExternalFD):
@classmethod
def supports(cls, info_dict):
return info_dict['protocol'] in ('http', 'https', 'ftp', 'ftps', 'm3u8', 'rtsp', 'rtmp', 'mms')
SUPPORTED_PROTOCOLS = ('http', 'https', 'ftp', 'ftps', 'm3u8', 'rtsp', 'rtmp', 'mms')
@classmethod
def available(cls):
@@ -233,7 +297,7 @@ class FFmpegFD(ExternalFD):
url = info_dict['url']
ffpp = FFmpegPostProcessor(downloader=self)
if not ffpp.available:
self.report_error('m3u8 download detected but ffmpeg or avconv could not be found. Please install one.')
self.report_error('m3u8 download detected but ffmpeg could not be found. Please install')
return False
ffpp.check_version()

View File

@@ -277,3 +277,24 @@ class FragmentFD(FileDownloader):
'status': 'finished',
'elapsed': elapsed,
})
def _prepare_external_frag_download(self, ctx):
if 'live' not in ctx:
ctx['live'] = False
if not ctx['live']:
total_frags_str = '%d' % ctx['total_frags']
ad_frags = ctx.get('ad_frags', 0)
if ad_frags:
total_frags_str += ' (not including %d ad)' % ad_frags
else:
total_frags_str = 'unknown (live)'
self.to_screen(
'[%s] Total fragments: %s' % (self.FD_NAME, total_frags_str))
tmpfilename = self.temp_name(ctx['filename'])
# Should be initialized before ytdl file check
ctx.update({
'tmpfilename': tmpfilename,
'fragment_index': 0,
})

View File

@@ -8,6 +8,7 @@ try:
except ImportError:
can_decrypt_frag = False
from ..downloader import _get_real_downloader
from .fragment import FragmentFD
from .external import FFmpegFD
@@ -73,10 +74,13 @@ class HlsFD(FragmentFD):
'hlsnative has detected features it does not support, '
'extraction will be delegated to ffmpeg')
fd = FFmpegFD(self.ydl, self.params)
for ph in self._progress_hooks:
fd.add_progress_hook(ph)
# TODO: Make progress updates work without hooking twice
# for ph in self._progress_hooks:
# fd.add_progress_hook(ph)
return fd.real_download(filename, info_dict)
real_downloader = _get_real_downloader(info_dict, 'frag_urls', self.params, None)
def is_ad_fragment_start(s):
return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s
or s.startswith('#UPLYNK-SEGMENT') and s.endswith(',ad'))
@@ -85,6 +89,8 @@ class HlsFD(FragmentFD):
return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=master' in s
or s.startswith('#UPLYNK-SEGMENT') and s.endswith(',segment'))
fragment_urls = []
media_frags = 0
ad_frags = 0
ad_frag_next = False
@@ -109,7 +115,10 @@ class HlsFD(FragmentFD):
'ad_frags': ad_frags,
}
self._prepare_and_start_frag_download(ctx)
if real_downloader:
self._prepare_external_frag_download(ctx)
else:
self._prepare_and_start_frag_download(ctx)
fragment_retries = self.params.get('fragment_retries', 0)
skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True)
@@ -140,6 +149,11 @@ class HlsFD(FragmentFD):
else compat_urlparse.urljoin(man_url, line))
if extra_query:
frag_url = update_url_query(frag_url, extra_query)
if real_downloader:
fragment_urls.append(frag_url)
continue
count = 0
headers = info_dict.get('http_headers', {})
if byte_range:
@@ -168,6 +182,7 @@ class HlsFD(FragmentFD):
self.report_error(
'giving up after %s fragment retries' % fragment_retries)
return False
if decrypt_info['METHOD'] == 'AES-128':
iv = decrypt_info.get('IV') or compat_struct_pack('>8xq', media_sequence)
decrypt_info['KEY'] = decrypt_info.get('KEY') or self.ydl.urlopen(
@@ -211,6 +226,17 @@ class HlsFD(FragmentFD):
elif is_ad_fragment_end(line):
ad_frag_next = False
self._finish_frag_download(ctx)
if real_downloader:
info_copy = info_dict.copy()
info_copy['url_list'] = fragment_urls
info_copy['decrypt_info'] = decrypt_info
fd = real_downloader(self.ydl, self.params)
# TODO: Make progress updates work without hooking twice
# for ph in self._progress_hooks:
# fd.add_progress_hook(ph)
success = fd.real_download(filename, info_copy)
if not success:
return False
else:
self._finish_frag_download(ctx)
return True

View File

@@ -4,6 +4,9 @@ import re
import json
from .fragment import FragmentFD
from ..compat import compat_urllib_error
from ..utils import try_get
from ..extractor.youtube import YoutubeBaseInfoExtractor as YT_BaseIE
class YoutubeLiveChatReplayFD(FragmentFD):
@@ -15,6 +18,7 @@ class YoutubeLiveChatReplayFD(FragmentFD):
video_id = info_dict['video_id']
self.to_screen('[%s] Downloading live chat' % self.FD_NAME)
fragment_retries = self.params.get('fragment_retries', 0)
test = self.params.get('test', False)
ctx = {
@@ -28,15 +32,61 @@ class YoutubeLiveChatReplayFD(FragmentFD):
return self._download_fragment(ctx, url, info_dict, headers)
def parse_yt_initial_data(data):
window_patt = b'window\\["ytInitialData"\\]\\s*=\\s*(.*?)(?<=});'
var_patt = b'var\\s+ytInitialData\\s*=\\s*(.*?)(?<=});'
for patt in window_patt, var_patt:
patterns = (
r'%s\\s*%s' % (YT_BaseIE._YT_INITIAL_DATA_RE, YT_BaseIE._YT_INITIAL_BOUNDARY_RE),
r'%s' % YT_BaseIE._YT_INITIAL_DATA_RE)
data = data.decode('utf-8', 'replace')
for patt in patterns:
try:
raw_json = re.search(patt, data).group(1)
return json.loads(raw_json)
except AttributeError:
continue
def download_and_parse_fragment(url, frag_index):
count = 0
while count <= fragment_retries:
try:
success, raw_fragment = dl_fragment(url)
if not success:
return False, None, None
data = parse_yt_initial_data(raw_fragment)
if not data:
raw_data = json.loads(raw_fragment)
# sometimes youtube replies with a list
if not isinstance(raw_data, list):
raw_data = [raw_data]
try:
data = next(item['response'] for item in raw_data if 'response' in item)
except StopIteration:
data = {}
live_chat_continuation = try_get(
data,
lambda x: x['continuationContents']['liveChatContinuation'], dict) or {}
offset = continuation_id = None
processed_fragment = bytearray()
for action in live_chat_continuation.get('actions', []):
if 'replayChatItemAction' in action:
replay_chat_item_action = action['replayChatItemAction']
offset = int(replay_chat_item_action['videoOffsetTimeMsec'])
processed_fragment.extend(
json.dumps(action, ensure_ascii=False).encode('utf-8') + b'\n')
if offset is not None:
continuation_id = try_get(
live_chat_continuation,
lambda x: x['continuations'][0]['liveChatReplayContinuationData']['continuation'])
self._append_fragment(ctx, processed_fragment)
return True, continuation_id, offset
except compat_urllib_error.HTTPError as err:
count += 1
if count <= fragment_retries:
self.report_retry_fragment(err, frag_index, count, fragment_retries)
if count > fragment_retries:
self.report_error('giving up after %s fragment retries' % fragment_retries)
return False, None, None
self._prepare_and_start_frag_download(ctx)
success, raw_fragment = dl_fragment(
@@ -44,54 +94,25 @@ class YoutubeLiveChatReplayFD(FragmentFD):
if not success:
return False
data = parse_yt_initial_data(raw_fragment)
continuation_id = data['contents']['twoColumnWatchNextResults']['conversationBar']['liveChatRenderer']['continuations'][0]['reloadContinuationData']['continuation']
continuation_id = try_get(
data,
lambda x: x['contents']['twoColumnWatchNextResults']['conversationBar']['liveChatRenderer']['continuations'][0]['reloadContinuationData']['continuation'])
# no data yet but required to call _append_fragment
self._append_fragment(ctx, b'')
first = True
offset = None
frag_index = offset = 0
while continuation_id is not None:
data = None
if first:
url = 'https://www.youtube.com/live_chat_replay?continuation={}'.format(continuation_id)
success, raw_fragment = dl_fragment(url)
if not success:
return False
data = parse_yt_initial_data(raw_fragment)
else:
url = ('https://www.youtube.com/live_chat_replay/get_live_chat_replay'
+ '?continuation={}'.format(continuation_id)
+ '&playerOffsetMs={}'.format(max(offset - 5000, 0))
+ '&hidden=false'
+ '&pbj=1')
success, raw_fragment = dl_fragment(url)
if not success:
return False
data = json.loads(raw_fragment)['response']
first = False
continuation_id = None
live_chat_continuation = data['continuationContents']['liveChatContinuation']
offset = None
processed_fragment = bytearray()
if 'actions' in live_chat_continuation:
for action in live_chat_continuation['actions']:
if 'replayChatItemAction' in action:
replay_chat_item_action = action['replayChatItemAction']
offset = int(replay_chat_item_action['videoOffsetTimeMsec'])
processed_fragment.extend(
json.dumps(action, ensure_ascii=False).encode('utf-8') + b'\n')
try:
continuation_id = live_chat_continuation['continuations'][0]['liveChatReplayContinuationData']['continuation']
except KeyError:
continuation_id = None
self._append_fragment(ctx, processed_fragment)
if test or offset is None:
frag_index += 1
url = ''.join((
'https://www.youtube.com/live_chat_replay',
'/get_live_chat_replay' if frag_index > 1 else '',
'?continuation=%s' % continuation_id,
'&playerOffsetMs=%d&hidden=false&pbj=1' % max(offset - 5000, 0) if frag_index > 1 else ''))
success, continuation_id, offset = download_and_parse_fragment(url, frag_index)
if not success:
return False
if test:
break
self._finish_frag_download(ctx)
return True

View File

@@ -7,9 +7,10 @@ try:
from .lazy_extractors import _ALL_CLASSES
_LAZY_LOADER = True
_PLUGIN_CLASSES = []
except ImportError:
_LAZY_LOADER = False
if not _LAZY_LOADER:
from .extractors import *
_PLUGIN_CLASSES = load_plugins('extractor', 'IE', globals())

View File

@@ -1,14 +1,15 @@
# coding: utf-8
from __future__ import unicode_literals
import calendar
import re
import time
from .amp import AMPIE
from .common import InfoExtractor
from .youtube import YoutubeIE
from ..compat import compat_urlparse
from ..utils import (
parse_duration,
parse_iso8601,
try_get,
)
class AbcNewsVideoIE(AMPIE):
@@ -18,8 +19,8 @@ class AbcNewsVideoIE(AMPIE):
(?:
abcnews\.go\.com/
(?:
[^/]+/video/(?P<display_id>[0-9a-z-]+)-|
video/embed\?.*?\bid=
(?:[^/]+/)*video/(?P<display_id>[0-9a-z-]+)-|
video/(?:embed|itemfeed)\?.*?\bid=
)|
fivethirtyeight\.abcnews\.go\.com/video/embed/\d+/
)
@@ -36,6 +37,8 @@ class AbcNewsVideoIE(AMPIE):
'description': 'George Stephanopoulos goes one-on-one with Iranian Foreign Minister Dr. Javad Zarif.',
'duration': 180,
'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1380454200,
'upload_date': '20130929',
},
'params': {
# m3u8 download
@@ -47,6 +50,12 @@ class AbcNewsVideoIE(AMPIE):
}, {
'url': 'http://abcnews.go.com/2020/video/2020-husband-stands-teacher-jail-student-affairs-26119478',
'only_matching': True,
}, {
'url': 'http://abcnews.go.com/video/itemfeed?id=46979033',
'only_matching': True,
}, {
'url': 'https://abcnews.go.com/GMA/News/video/history-christmas-story-67894761',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -67,28 +76,23 @@ class AbcNewsIE(InfoExtractor):
_VALID_URL = r'https?://abcnews\.go\.com/(?:[^/]+/)+(?P<display_id>[0-9a-z-]+)/story\?id=(?P<id>\d+)'
_TESTS = [{
'url': 'http://abcnews.go.com/Blotter/News/dramatic-video-rare-death-job-america/story?id=10498713#.UIhwosWHLjY',
# Youtube Embeds
'url': 'https://abcnews.go.com/Entertainment/peter-billingsley-child-actor-christmas-story-hollywood-power/story?id=51286501',
'info_dict': {
'id': '10505354',
'ext': 'flv',
'display_id': 'dramatic-video-rare-death-job-america',
'title': 'Occupational Hazards',
'description': 'Nightline investigates the dangers that lurk at various jobs.',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20100428',
'timestamp': 1272412800,
'id': '51286501',
'title': "Peter Billingsley: From child actor in 'A Christmas Story' to Hollywood power player",
'description': 'Billingsley went from a child actor to Hollywood power player.',
},
'add_ie': ['AbcNewsVideo'],
'playlist_count': 5,
}, {
'url': 'http://abcnews.go.com/Entertainment/justin-timberlake-performs-stop-feeling-eurovision-2016/story?id=39125818',
'info_dict': {
'id': '38897857',
'ext': 'mp4',
'display_id': 'justin-timberlake-performs-stop-feeling-eurovision-2016',
'title': 'Justin Timberlake Drops Hints For Secret Single',
'description': 'Lara Spencer reports the buzziest stories of the day in "GMA" Pop News.',
'upload_date': '20160515',
'timestamp': 1463329500,
'upload_date': '20160505',
'timestamp': 1462442280,
},
'params': {
# m3u8 download
@@ -100,49 +104,55 @@ class AbcNewsIE(InfoExtractor):
}, {
'url': 'http://abcnews.go.com/Technology/exclusive-apple-ceo-tim-cook-iphone-cracking-software/story?id=37173343',
'only_matching': True,
}, {
# inline.type == 'video'
'url': 'http://abcnews.go.com/Technology/exclusive-apple-ceo-tim-cook-iphone-cracking-software/story?id=37173343',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
display_id = mobj.group('display_id')
video_id = mobj.group('id')
story_id = self._match_id(url)
webpage = self._download_webpage(url, story_id)
story = self._parse_json(self._search_regex(
r"window\['__abcnews__'\]\s*=\s*({.+?});",
webpage, 'data'), story_id)['page']['content']['story']['everscroll'][0]
article_contents = story.get('articleContents') or {}
webpage = self._download_webpage(url, video_id)
video_url = self._search_regex(
r'window\.abcnvideo\.url\s*=\s*"([^"]+)"', webpage, 'video URL')
full_video_url = compat_urlparse.urljoin(url, video_url)
def entries():
featured_video = story.get('featuredVideo') or {}
feed = try_get(featured_video, lambda x: x['video']['feed'])
if feed:
yield {
'_type': 'url',
'id': featured_video.get('id'),
'title': featured_video.get('name'),
'url': feed,
'thumbnail': featured_video.get('images'),
'description': featured_video.get('description'),
'timestamp': parse_iso8601(featured_video.get('uploadDate')),
'duration': parse_duration(featured_video.get('duration')),
'ie_key': AbcNewsVideoIE.ie_key(),
}
youtube_url = YoutubeIE._extract_url(webpage)
for inline in (article_contents.get('inlines') or []):
inline_type = inline.get('type')
if inline_type == 'iframe':
iframe_url = try_get(inline, lambda x: x['attrs']['src'])
if iframe_url:
yield self.url_result(iframe_url)
elif inline_type == 'video':
video_id = inline.get('id')
if video_id:
yield {
'_type': 'url',
'id': video_id,
'url': 'http://abcnews.go.com/video/embed?id=' + video_id,
'thumbnail': inline.get('imgSrc') or inline.get('imgDefault'),
'description': inline.get('description'),
'duration': parse_duration(inline.get('duration')),
'ie_key': AbcNewsVideoIE.ie_key(),
}
timestamp = None
date_str = self._html_search_regex(
r'<span[^>]+class="timestamp">([^<]+)</span>',
webpage, 'timestamp', fatal=False)
if date_str:
tz_offset = 0
if date_str.endswith(' ET'): # Eastern Time
tz_offset = -5
date_str = date_str[:-3]
date_formats = ['%b. %d, %Y', '%b %d, %Y, %I:%M %p']
for date_format in date_formats:
try:
timestamp = calendar.timegm(time.strptime(date_str.strip(), date_format))
except ValueError:
continue
if timestamp is not None:
timestamp -= tz_offset * 3600
entry = {
'_type': 'url_transparent',
'ie_key': AbcNewsVideoIE.ie_key(),
'url': full_video_url,
'id': video_id,
'display_id': display_id,
'timestamp': timestamp,
}
if youtube_url:
entries = [entry, self.url_result(youtube_url, ie=YoutubeIE.ie_key())]
return self.playlist_result(entries)
return entry
return self.playlist_result(
entries(), story_id, article_contents.get('headline'),
article_contents.get('subHead'))

View File

@@ -26,6 +26,7 @@ from ..utils import (
strip_or_none,
try_get,
unified_strdate,
urlencode_postdata,
)
@@ -51,9 +52,12 @@ class ADNIE(InfoExtractor):
}
}
_NETRC_MACHINE = 'animedigitalnetwork'
_BASE_URL = 'http://animedigitalnetwork.fr'
_API_BASE_URL = 'https://gw.api.animedigitalnetwork.fr/'
_PLAYER_BASE_URL = _API_BASE_URL + 'player/'
_HEADERS = {}
_LOGIN_ERR_MESSAGE = 'Unable to log in'
_RSA_KEY = (0x9B42B08905199A5CCE2026274399CA560ECB209EE9878A708B1C0812E1BB8CB5D1FB7441861147C1A1F2F3A0476DD63A9CAC20D3E983613346850AA6CB38F16DC7D720FD7D86FC6E5B3D5BBC72E14CD0BF9E869F2CEA2CCAD648F1DCE38F1FF916CEFB2D339B64AA0264372344BC775E265E8A852F88144AB0BD9AA06C1A4ABB, 65537)
_POS_ALIGN_MAP = {
'start': 1,
@@ -129,19 +133,42 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
}])
return subtitles
def _real_initialize(self):
username, password = self._get_login_info()
if not username:
return
try:
access_token = (self._download_json(
self._API_BASE_URL + 'authentication/login', None,
'Logging in', self._LOGIN_ERR_MESSAGE, fatal=False,
data=urlencode_postdata({
'password': password,
'rememberMe': False,
'source': 'Web',
'username': username,
})) or {}).get('accessToken')
if access_token:
self._HEADERS = {'authorization': 'Bearer ' + access_token}
except ExtractorError as e:
message = None
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
resp = self._parse_json(
e.cause.read().decode(), None, fatal=False) or {}
message = resp.get('message') or resp.get('code')
self.report_warning(message or self._LOGIN_ERR_MESSAGE)
def _real_extract(self, url):
video_id = self._match_id(url)
video_base_url = self._PLAYER_BASE_URL + 'video/%s/' % video_id
player = self._download_json(
video_base_url + 'configuration', video_id,
'Downloading player config JSON metadata')['player']
'Downloading player config JSON metadata',
headers=self._HEADERS)['player']
options = player['options']
user = options['user']
if not user.get('hasAccess'):
raise ExtractorError(
'This video is only available for paying users', expected=True)
# self.raise_login_required() # FIXME: Login is not implemented
self.raise_login_required()
token = self._download_json(
user.get('refreshTokenUrl') or (self._PLAYER_BASE_URL + 'refresh/token'),
@@ -188,8 +215,7 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
message = error.get('message')
if e.cause.code == 403 and error.get('code') == 'player-bad-geolocation-country':
self.raise_geo_restricted(msg=message)
else:
raise ExtractorError(message)
raise ExtractorError(message)
else:
raise ExtractorError('Giving up retrying')

View File

@@ -252,7 +252,7 @@ class AENetworksShowIE(AENetworksListBaseIE):
_TESTS = [{
'url': 'http://www.history.com/shows/ancient-aliens',
'info_dict': {
'id': 'SH012427480000',
'id': 'SERIES1574',
'title': 'Ancient Aliens',
'description': 'md5:3f6d74daf2672ff3ae29ed732e37ea7f',
},

View File

@@ -8,6 +8,7 @@ from ..utils import (
int_or_none,
mimetype2ext,
parse_iso8601,
unified_timestamp,
url_or_none,
)
@@ -88,7 +89,7 @@ class AMPIE(InfoExtractor):
self._sort_formats(formats)
timestamp = parse_iso8601(item.get('pubDate'), ' ') or parse_iso8601(item.get('dc-date'))
timestamp = unified_timestamp(item.get('pubDate'), ' ') or parse_iso8601(item.get('dc-date'))
return {
'id': video_id,

View File

@@ -21,6 +21,16 @@ from ..utils import (
unsmuggle_url,
)
# This import causes a ModuleNotFoundError on some systems for unknown reason.
# See issues:
# https://github.com/pukkandan/yt-dlp/issues/35
# https://github.com/ytdl-org/youtube-dl/issues/27449
# https://github.com/animelover1984/youtube-dl/issues/17
try:
from .anvato_token_generator import NFLTokenGenerator
except ImportError:
NFLTokenGenerator = None
def md5_text(s):
if not isinstance(s, compat_str):
@@ -203,6 +213,10 @@ class AnvatoIE(InfoExtractor):
'telemundo': 'anvato_mcp_telemundo_web_prod_c5278d51ad46fda4b6ca3d0ea44a7846a054f582'
}
_TOKEN_GENERATORS = {
'GXvEgwyJeWem8KCYXfeoHWknwP48Mboj': NFLTokenGenerator,
}
_API_KEY = '3hwbSuqqT690uxjNYBktSQpa5ZrpYYR0Iofx7NcJHyA'
_ANVP_RE = r'<script[^>]+\bdata-anvp\s*=\s*(["\'])(?P<anvp>(?:(?!\1).)+)\1'
@@ -262,9 +276,12 @@ class AnvatoIE(InfoExtractor):
'anvrid': anvrid,
'anvts': server_time,
}
api['anvstk'] = md5_text('%s|%s|%d|%s' % (
access_key, anvrid, server_time,
self._ANVACK_TABLE.get(access_key, self._API_KEY)))
if self._TOKEN_GENERATORS.get(access_key) is not None:
api['anvstk2'] = self._TOKEN_GENERATORS[access_key].generate(self, access_key, video_id)
else:
api['anvstk'] = md5_text('%s|%s|%d|%s' % (
access_key, anvrid, server_time,
self._ANVACK_TABLE.get(access_key, self._API_KEY)))
return self._download_json(
video_data_url, video_id, transform_source=strip_jsonp,

View File

@@ -0,0 +1,247 @@
# coding: utf-8
from __future__ import unicode_literals
import random
import re
from .common import InfoExtractor
from ..utils import ExtractorError, try_get, compat_str, str_or_none
from ..compat import compat_urllib_parse_unquote
class AudiusBaseIE(InfoExtractor):
_API_BASE = None
_API_V = '/v1'
def _get_response_data(self, response):
if isinstance(response, dict):
response_data = response.get('data')
if response_data is not None:
return response_data
if len(response) == 1 and 'message' in response:
raise ExtractorError('API error: %s' % response['message'],
expected=True)
raise ExtractorError('Unexpected API response')
def _select_api_base(self):
"""Selecting one of the currently available API hosts"""
response = super(AudiusBaseIE, self)._download_json(
'https://api.audius.co/', None,
note='Requesting available API hosts',
errnote='Unable to request available API hosts')
hosts = self._get_response_data(response)
if isinstance(hosts, list):
self._API_BASE = random.choice(hosts)
return
raise ExtractorError('Unable to get available API hosts')
@staticmethod
def _prepare_url(url, title):
"""
Audius removes forward slashes from the uri, but leaves backslashes.
The problem is that the current version of Chrome replaces backslashes
in the address bar with a forward slashes, so if you copy the link from
there and paste it into youtube-dl, you won't be able to download
anything from this link, since the Audius API won't be able to resolve
this url
"""
url = compat_urllib_parse_unquote(url)
title = compat_urllib_parse_unquote(title)
if '/' in title or '%2F' in title:
fixed_title = title.replace('/', '%5C').replace('%2F', '%5C')
return url.replace(title, fixed_title)
return url
def _api_request(self, path, item_id=None, note='Downloading JSON metadata',
errnote='Unable to download JSON metadata',
expected_status=None):
if self._API_BASE is None:
self._select_api_base()
try:
response = super(AudiusBaseIE, self)._download_json(
'%s%s%s' % (self._API_BASE, self._API_V, path), item_id, note=note,
errnote=errnote, expected_status=expected_status)
except ExtractorError as exc:
# some of Audius API hosts may not work as expected and return HTML
if 'Failed to parse JSON' in compat_str(exc):
raise ExtractorError('An error occurred while receiving data. Try again',
expected=True)
raise exc
return self._get_response_data(response)
def _resolve_url(self, url, item_id):
return self._api_request('/resolve?url=%s' % url, item_id,
expected_status=404)
class AudiusIE(AudiusBaseIE):
_VALID_URL = r'''(?x)https?://(?:www\.)?(?:audius\.co/(?P<uploader>[\w\d-]+)(?!/album|/playlist)/(?P<title>\S+))'''
IE_DESC = 'Audius.co'
_TESTS = [
{
# URL from Chrome address bar which replace backslash to forward slash
'url': 'https://audius.co/test_acc/t%D0%B5%D0%B5%D0%B5est-1.%5E_%7B%7D/%22%3C%3E.%E2%84%96~%60-198631',
'md5': '92c35d3e754d5a0f17eef396b0d33582',
'info_dict': {
'id': 'xd8gY',
'title': '''Tеееest/ 1.!@#$%^&*()_+=[]{};'\\\":<>,.?/№~`''',
'ext': 'mp3',
'description': 'Description',
'duration': 30,
'track': '''Tеееest/ 1.!@#$%^&*()_+=[]{};'\\\":<>,.?/№~`''',
'artist': 'test',
'genre': 'Electronic',
'thumbnail': r're:https?://.*\.jpg',
'view_count': int,
'like_count': int,
'repost_count': int,
}
},
{
# Regular track
'url': 'https://audius.co/voltra/radar-103692',
'md5': '491898a0a8de39f20c5d6a8a80ab5132',
'info_dict': {
'id': 'KKdy2',
'title': 'RADAR',
'ext': 'mp3',
'duration': 318,
'track': 'RADAR',
'artist': 'voltra',
'genre': 'Trance',
'thumbnail': r're:https?://.*\.jpg',
'view_count': int,
'like_count': int,
'repost_count': int,
}
},
]
_ARTWORK_MAP = {
"150x150": 150,
"480x480": 480,
"1000x1000": 1000
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
track_id = try_get(mobj, lambda x: x.group('track_id'))
if track_id is None:
title = mobj.group('title')
# uploader = mobj.group('uploader')
url = self._prepare_url(url, title)
track_data = self._resolve_url(url, title)
else: # API link
title = None
# uploader = None
track_data = self._api_request('/tracks/%s' % track_id, track_id)
if not isinstance(track_data, dict):
raise ExtractorError('Unexpected API response')
track_id = track_data.get('id')
if track_id is None:
raise ExtractorError('Unable to get ID of the track')
artworks_data = track_data.get('artwork')
thumbnails = []
if isinstance(artworks_data, dict):
for quality_key, thumbnail_url in artworks_data.items():
thumbnail = {
"url": thumbnail_url
}
quality_code = self._ARTWORK_MAP.get(quality_key)
if quality_code is not None:
thumbnail['preference'] = quality_code
thumbnails.append(thumbnail)
return {
'id': track_id,
'title': track_data.get('title', title),
'url': '%s/v1/tracks/%s/stream' % (self._API_BASE, track_id),
'ext': 'mp3',
'description': track_data.get('description'),
'duration': track_data.get('duration'),
'track': track_data.get('title'),
'artist': try_get(track_data, lambda x: x['user']['name'], compat_str),
'genre': track_data.get('genre'),
'thumbnails': thumbnails,
'view_count': track_data.get('play_count'),
'like_count': track_data.get('favorite_count'),
'repost_count': track_data.get('repost_count'),
}
class AudiusTrackIE(AudiusIE):
_VALID_URL = r'''(?x)(?:audius:)(?:https?://(?:www\.)?.+/v1/tracks/)?(?P<track_id>\w+)'''
IE_NAME = 'audius:track'
IE_DESC = 'Audius track ID or API link. Prepend with "audius:"'
_TESTS = [
{
'url': 'audius:9RWlo',
'only_matching': True
},
{
'url': 'audius:http://discoveryprovider.audius.prod-us-west-2.staked.cloud/v1/tracks/9RWlo',
'only_matching': True
},
]
class AudiusPlaylistIE(AudiusBaseIE):
_VALID_URL = r'https?://(?:www\.)?audius\.co/(?P<uploader>[\w\d-]+)/(?:album|playlist)/(?P<title>\S+)'
IE_NAME = 'audius:playlist'
IE_DESC = 'Audius.co playlists'
_TEST = {
'url': 'https://audius.co/test_acc/playlist/test-playlist-22910',
'info_dict': {
'id': 'DNvjN',
'title': 'test playlist',
'description': 'Test description\n\nlol',
},
'playlist_count': 175,
}
def _build_playlist(self, tracks):
entries = []
for track in tracks:
if not isinstance(track, dict):
raise ExtractorError('Unexpected API response')
track_id = str_or_none(track.get('id'))
if not track_id:
raise ExtractorError('Unable to get track ID from playlist')
entries.append(self.url_result(
'audius:%s' % track_id,
ie=AudiusTrackIE.ie_key(), video_id=track_id))
return entries
def _real_extract(self, url):
self._select_api_base()
mobj = re.match(self._VALID_URL, url)
title = mobj.group('title')
# uploader = mobj.group('uploader')
url = self._prepare_url(url, title)
playlist_response = self._resolve_url(url, title)
if not isinstance(playlist_response, list) or len(playlist_response) != 1:
raise ExtractorError('Unexpected API response')
playlist_data = playlist_response[0]
if not isinstance(playlist_data, dict):
raise ExtractorError('Unexpected API response')
playlist_id = playlist_data.get('id')
if playlist_id is None:
raise ExtractorError('Unable to get playlist ID')
playlist_tracks = self._api_request(
'/playlists/%s/tracks' % playlist_id,
title, note='Downloading playlist tracks metadata',
errnote='Unable to download playlist tracks metadata')
if not isinstance(playlist_tracks, list):
raise ExtractorError('Unexpected API response')
entries = self._build_playlist(playlist_tracks)
return self.playlist_result(entries, playlist_id,
playlist_data.get('playlist_name', title),
playlist_data.get('description'))

View File

@@ -48,6 +48,7 @@ class AWAANBaseIE(InfoExtractor):
'duration': int_or_none(video_data.get('duration')),
'timestamp': parse_iso8601(video_data.get('create_time'), ' '),
'is_live': is_live,
'uploader_id': video_data.get('user_id'),
}
@@ -107,6 +108,7 @@ class AWAANLiveIE(AWAANBaseIE):
'title': 're:Dubai Al Oula [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'upload_date': '20150107',
'timestamp': 1420588800,
'uploader_id': '71',
},
'params': {
# m3u8 download

View File

@@ -47,7 +47,7 @@ class AZMedienIE(InfoExtractor):
'url': 'https://www.telebaern.tv/telebaern-news/montag-1-oktober-2018-ganze-sendung-133531189#video=0_7xjo9lf1',
'only_matching': True
}]
_API_TEMPL = 'https://www.%s/api/pub/gql/%s/NewsArticleTeaser/cb9f2f81ed22e9b47f4ca64ea3cc5a5d13e88d1d'
_API_TEMPL = 'https://www.%s/api/pub/gql/%s/NewsArticleTeaser/a4016f65fe62b81dc6664dd9f4910e4ab40383be'
_PARTNER_ID = '1719221'
def _real_extract(self, url):

View File

@@ -2,9 +2,10 @@
from __future__ import unicode_literals
import hashlib
import json
import re
from .common import InfoExtractor
from .common import InfoExtractor, SearchInfoExtractor
from ..compat import (
compat_parse_qs,
compat_urlparse,
@@ -32,13 +33,14 @@ class BiliBiliIE(InfoExtractor):
(?:
video/[aA][vV]|
anime/(?P<anime_id>\d+)/play\#
)(?P<id_bv>\d+)|
video/[bB][vV](?P<id>[^/?#&]+)
)(?P<id>\d+)|
video/[bB][vV](?P<id_bv>[^/?#&]+)
)
(?:/?\?p=(?P<page>\d+))?
'''
_TESTS = [{
'url': 'http://www.bilibili.tv/video/av1074402/',
'url': 'http://www.bilibili.com/video/av1074402/',
'md5': '5f7d29e1a2872f3df0cf76b1f87d3788',
'info_dict': {
'id': '1074402',
@@ -56,6 +58,10 @@ class BiliBiliIE(InfoExtractor):
# Tested in BiliBiliBangumiIE
'url': 'http://bangumi.bilibili.com/anime/1869/play#40062',
'only_matching': True,
}, {
# bilibili.tv
'url': 'http://www.bilibili.tv/video/av1074402/',
'only_matching': True,
}, {
'url': 'http://bangumi.bilibili.com/anime/5802/play#100643',
'md5': '3f721ad1e75030cc06faf73587cfec57',
@@ -124,12 +130,20 @@ class BiliBiliIE(InfoExtractor):
url, smuggled_data = unsmuggle_url(url, {})
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id') or mobj.group('id_bv')
video_id = mobj.group('id_bv') or mobj.group('id')
av_id, bv_id = self._get_video_id_set(video_id, mobj.group('id_bv') is not None)
video_id = av_id
anime_id = mobj.group('anime_id')
page_id = mobj.group('page')
webpage = self._download_webpage(url, video_id)
if 'anime/' not in url:
cid = self._search_regex(
r'\bcid(?:["\']:|=)(\d+),["\']page(?:["\']:|=)' + str(page_id), webpage, 'cid',
default=None
) or self._search_regex(
r'\bcid(?:["\']:|=)(\d+)', webpage, 'cid',
default=None
) or compat_parse_qs(self._search_regex(
@@ -207,9 +221,9 @@ class BiliBiliIE(InfoExtractor):
break
title = self._html_search_regex(
('<h1[^>]+\btitle=(["\'])(?P<title>(?:(?!\1).)+)\1',
'(?s)<h1[^>]*>(?P<title>.+?)</h1>'), webpage, 'title',
group='title')
(r'<h1[^>]+\btitle=(["\'])(?P<title>(?:(?!\1).)+)\1',
r'(?s)<h1[^>]*>(?P<title>.+?)</h1>'), webpage, 'title',
group='title') + ('_p' + str(page_id) if page_id is not None else '')
description = self._html_search_meta('description', webpage)
timestamp = unified_timestamp(self._html_search_regex(
r'<time[^>]+datetime="([^"]+)"', webpage, 'upload time',
@@ -219,7 +233,8 @@ class BiliBiliIE(InfoExtractor):
# TODO 'view_count' requires deobfuscating Javascript
info = {
'id': video_id,
'id': str(video_id) if page_id is None else '%s_p%s' % (video_id, page_id),
'cid': cid,
'title': title,
'description': description,
'timestamp': timestamp,
@@ -235,27 +250,134 @@ class BiliBiliIE(InfoExtractor):
'uploader': uploader_mobj.group('name'),
'uploader_id': uploader_mobj.group('id'),
})
if not info.get('uploader'):
info['uploader'] = self._html_search_meta(
'author', webpage, 'uploader', default=None)
comments = None
if self._downloader.params.get('getcomments', False):
comments = self._get_all_comment_pages(video_id)
raw_danmaku = self._get_raw_danmaku(video_id, cid)
raw_tags = self._get_tags(video_id)
tags = list(map(lambda x: x['tag_name'], raw_tags))
top_level_info = {
'raw_danmaku': raw_danmaku,
'comments': comments,
'comment_count': len(comments) if comments is not None else None,
'tags': tags,
'raw_tags': raw_tags,
}
'''
# Requires https://github.com/m13253/danmaku2ass which is licenced under GPL3
# See https://github.com/animelover1984/youtube-dl
danmaku = NiconicoIE.CreateDanmaku(raw_danmaku, commentType='Bilibili', x=1024, y=576)
entries[0]['subtitles'] = {
'danmaku': [{
'ext': 'ass',
'data': danmaku
}]
}
'''
for entry in entries:
entry.update(info)
if len(entries) == 1:
entries[0].update(top_level_info)
return entries[0]
else:
for idx, entry in enumerate(entries):
entry['id'] = '%s_part%d' % (video_id, (idx + 1))
return {
global_info = {
'_type': 'multi_video',
'id': video_id,
'bv_id': bv_id,
'title': title,
'description': description,
'entries': entries,
}
global_info.update(info)
global_info.update(top_level_info)
return global_info
def _get_video_id_set(self, id, is_bv):
query = {'bvid': id} if is_bv else {'aid': id}
response = self._download_json(
"http://api.bilibili.cn/x/web-interface/view",
id, query=query,
note='Grabbing original ID via API')
if response['code'] == -400:
raise ExtractorError('Video ID does not exist', expected=True, video_id=id)
elif response['code'] != 0:
raise ExtractorError('Unknown error occurred during API check (code %s)' % response['code'], expected=True, video_id=id)
return (response['data']['aid'], response['data']['bvid'])
# recursive solution to getting every page of comments for the video
# we can stop when we reach a page without any comments
def _get_all_comment_pages(self, video_id, commentPageNumber=0):
comment_url = "https://api.bilibili.com/x/v2/reply?jsonp=jsonp&pn=%s&type=1&oid=%s&sort=2&_=1567227301685" % (commentPageNumber, video_id)
json_str = self._download_webpage(
comment_url, video_id,
note='Extracting comments from page %s' % (commentPageNumber))
replies = json.loads(json_str)['data']['replies']
if replies is None:
return []
return self._get_all_children(replies) + self._get_all_comment_pages(video_id, commentPageNumber + 1)
# extracts all comments in the tree
def _get_all_children(self, replies):
if replies is None:
return []
ret = []
for reply in replies:
author = reply['member']['uname']
author_id = reply['member']['mid']
id = reply['rpid']
text = reply['content']['message']
timestamp = reply['ctime']
parent = reply['parent'] if reply['parent'] != 0 else 'root'
comment = {
"author": author,
"author_id": author_id,
"id": id,
"text": text,
"timestamp": timestamp,
"parent": parent,
}
ret.append(comment)
# from the JSON, the comment structure seems arbitrarily deep, but I could be wrong.
# Regardless, this should work.
ret += self._get_all_children(reply['replies'])
return ret
def _get_raw_danmaku(self, video_id, cid):
# This will be useful if I decide to scrape all pages instead of doing them individually
# cid_url = "https://www.bilibili.com/widget/getPageList?aid=%s" % (video_id)
# cid_str = self._download_webpage(cid_url, video_id, note=False)
# cid = json.loads(cid_str)[0]['cid']
danmaku_url = "https://comment.bilibili.com/%s.xml" % (cid)
danmaku = self._download_webpage(danmaku_url, video_id, note='Downloading danmaku comments')
return danmaku
def _get_tags(self, video_id):
tags_url = "https://api.bilibili.com/x/tag/archive/tags?aid=%s" % (video_id)
tags_json = self._download_json(tags_url, video_id, note='Downloading tags')
return tags_json['data']
class BiliBiliBangumiIE(InfoExtractor):
_VALID_URL = r'https?://bangumi\.bilibili\.com/anime/(?P<id>\d+)'
@@ -324,6 +446,73 @@ class BiliBiliBangumiIE(InfoExtractor):
season_info.get('bangumi_title'), season_info.get('evaluate'))
class BilibiliChannelIE(InfoExtractor):
_VALID_URL = r'https?://space.bilibili\.com/(?P<id>\d+)'
# May need to add support for pagination? Need to find a user with many video uploads to test
_API_URL = "https://api.bilibili.com/x/space/arc/search?mid=%s&pn=1&ps=25&jsonp=jsonp"
_TEST = {} # TODO: Add tests
def _real_extract(self, url):
list_id = self._match_id(url)
json_str = self._download_webpage(self._API_URL % list_id, "None")
json_parsed = json.loads(json_str)
entries = [{
'_type': 'url',
'ie_key': BiliBiliIE.ie_key(),
'url': ('https://www.bilibili.com/video/%s' %
entry['bvid']),
'id': entry['bvid'],
} for entry in json_parsed['data']['list']['vlist']]
return {
'_type': 'playlist',
'id': list_id,
'entries': entries
}
class BiliBiliSearchIE(SearchInfoExtractor):
IE_DESC = 'Bilibili video search, "bilisearch" keyword'
_MAX_RESULTS = 100000
_SEARCH_KEY = 'bilisearch'
MAX_NUMBER_OF_RESULTS = 1000
def _get_n_results(self, query, n):
"""Get a specified number of results for a query"""
entries = []
pageNumber = 0
while True:
pageNumber += 1
# FIXME
api_url = "https://api.bilibili.com/x/web-interface/search/type?context=&page=%s&order=pubdate&keyword=%s&duration=0&tids_2=&__refresh__=true&search_type=video&tids=0&highlight=1" % (pageNumber, query)
json_str = self._download_webpage(
api_url, "None", query={"Search_key": query},
note='Extracting results from page %s' % pageNumber)
data = json.loads(json_str)['data']
# FIXME: this is hideous
if "result" not in data:
return {
'_type': 'playlist',
'id': query,
'entries': entries[:n]
}
videos = data['result']
for video in videos:
e = self.url_result(video['arcurl'], 'BiliBili', str(video['aid']))
entries.append(e)
if(len(entries) >= n or len(videos) >= BiliBiliSearchIE.MAX_NUMBER_OF_RESULTS):
return {
'_type': 'playlist',
'id': query,
'entries': entries[:n]
}
class BilibiliAudioBaseIE(InfoExtractor):
def _call_api(self, path, sid, query=None):
if not query:

View File

@@ -90,13 +90,19 @@ class BleacherReportCMSIE(AMPIE):
_VALID_URL = r'https?://(?:www\.)?bleacherreport\.com/video_embed\?id=(?P<id>[0-9a-f-]{36}|\d{5})'
_TESTS = [{
'url': 'http://bleacherreport.com/video_embed?id=8fd44c2f-3dc5-4821-9118-2c825a98c0e1&library=video-cms',
'md5': '2e4b0a997f9228ffa31fada5c53d1ed1',
'md5': '670b2d73f48549da032861130488c681',
'info_dict': {
'id': '8fd44c2f-3dc5-4821-9118-2c825a98c0e1',
'ext': 'flv',
'ext': 'mp4',
'title': 'Cena vs. Rollins Would Expose the Heavyweight Division',
'description': 'md5:984afb4ade2f9c0db35f3267ed88b36e',
'upload_date': '20150723',
'timestamp': 1437679032,
},
'expected_warnings': [
'Unable to download f4m manifest'
]
}]
def _real_extract(self, url):

View File

@@ -12,7 +12,7 @@ from ..utils import (
class BravoTVIE(AdobePassIE):
_VALID_URL = r'https?://(?:www\.)?bravotv\.com/(?:[^/]+/)+(?P<id>[^/?#]+)'
_VALID_URL = r'https?://(?:www\.)?(?P<req_id>bravotv|oxygen)\.com/(?:[^/]+/)+(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.bravotv.com/top-chef/season-16/episode-15/videos/the-top-chef-season-16-winner-is',
'md5': 'e34684cfea2a96cd2ee1ef3a60909de9',
@@ -28,10 +28,13 @@ class BravoTVIE(AdobePassIE):
}, {
'url': 'http://www.bravotv.com/below-deck/season-3/ep-14-reunion-part-1',
'only_matching': True,
}, {
'url': 'https://www.oxygen.com/in-ice-cold-blood/season-2/episode-16/videos/handling-the-horwitz-house-after-the-murder-season-2',
'only_matching': True,
}]
def _real_extract(self, url):
display_id = self._match_id(url)
site, display_id = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(url, display_id)
settings = self._parse_json(self._search_regex(
r'<script[^>]+data-drupal-selector="drupal-settings-json"[^>]*>({.+?})</script>', webpage, 'drupal settings'),
@@ -53,11 +56,14 @@ class BravoTVIE(AdobePassIE):
tp_path = release_pid = tve['release_pid']
if tve.get('entitlement') == 'auth':
adobe_pass = settings.get('tve_adobe_auth', {})
if site == 'bravotv':
site = 'bravo'
resource = self._get_mvpd_resource(
adobe_pass.get('adobePassResourceId', 'bravo'),
adobe_pass.get('adobePassResourceId') or site,
tve['title'], release_pid, tve.get('rating'))
query['auth'] = self._extract_mvpd_auth(
url, release_pid, adobe_pass.get('adobePassRequestorId', 'bravo'), resource)
url, release_pid,
adobe_pass.get('adobePassRequestorId') or site, resource)
else:
shared_playlist = settings['ls_playlist']
account_pid = shared_playlist['account_pid']

View File

@@ -1,6 +1,7 @@
# coding: utf-8
from __future__ import unicode_literals
import datetime
import re
from .common import InfoExtractor
@@ -8,8 +9,8 @@ from ..utils import (
clean_html,
int_or_none,
parse_duration,
parse_iso8601,
parse_resolution,
try_get,
url_or_none,
)
@@ -24,8 +25,9 @@ class CCMAIE(InfoExtractor):
'ext': 'mp4',
'title': 'L\'espot de La Marató de TV3',
'description': 'md5:f12987f320e2f6e988e9908e4fe97765',
'timestamp': 1470918540,
'upload_date': '20160811',
'timestamp': 1478608140,
'upload_date': '20161108',
'age_limit': 0,
}
}, {
'url': 'http://www.ccma.cat/catradio/alacarta/programa/el-consell-de-savis-analitza-el-derbi/audio/943685/',
@@ -35,8 +37,24 @@ class CCMAIE(InfoExtractor):
'ext': 'mp3',
'title': 'El Consell de Savis analitza el derbi',
'description': 'md5:e2a3648145f3241cb9c6b4b624033e53',
'upload_date': '20171205',
'timestamp': 1512507300,
'upload_date': '20170512',
'timestamp': 1494622500,
'vcodec': 'none',
'categories': ['Esports'],
}
}, {
'url': 'http://www.ccma.cat/tv3/alacarta/crims/crims-josep-tallada-lespereu-me-capitol-1/video/6031387/',
'md5': 'b43c3d3486f430f3032b5b160d80cbc3',
'info_dict': {
'id': '6031387',
'ext': 'mp4',
'title': 'Crims - Josep Talleda, l\'"Espereu-me" (capítol 1)',
'description': 'md5:7cbdafb640da9d0d2c0f62bad1e74e60',
'timestamp': 1582577700,
'upload_date': '20200224',
'subtitles': 'mincount:4',
'age_limit': 16,
'series': 'Crims',
}
}]
@@ -72,17 +90,27 @@ class CCMAIE(InfoExtractor):
informacio = media['informacio']
title = informacio['titol']
durada = informacio.get('durada', {})
durada = informacio.get('durada') or {}
duration = int_or_none(durada.get('milisegons'), 1000) or parse_duration(durada.get('text'))
timestamp = parse_iso8601(informacio.get('data_emissio', {}).get('utc'))
tematica = try_get(informacio, lambda x: x['tematica']['text'])
timestamp = None
data_utc = try_get(informacio, lambda x: x['data_emissio']['utc'])
try:
timestamp = datetime.datetime.strptime(
data_utc, '%Y-%d-%mT%H:%M:%S%z').timestamp()
except TypeError:
pass
subtitles = {}
subtitols = media.get('subtitols', {})
if subtitols:
sub_url = subtitols.get('url')
subtitols = media.get('subtitols') or []
if isinstance(subtitols, dict):
subtitols = [subtitols]
for st in subtitols:
sub_url = st.get('url')
if sub_url:
subtitles.setdefault(
subtitols.get('iso') or subtitols.get('text') or 'ca', []).append({
st.get('iso') or st.get('text') or 'ca', []).append({
'url': sub_url,
})
@@ -97,6 +125,16 @@ class CCMAIE(InfoExtractor):
'height': int_or_none(imatges.get('alcada')),
}]
age_limit = None
codi_etic = try_get(informacio, lambda x: x['codi_etic']['id'])
if codi_etic:
codi_etic_s = codi_etic.split('_')
if len(codi_etic_s) == 2:
if codi_etic_s[1] == 'TP':
age_limit = 0
else:
age_limit = int_or_none(codi_etic_s[1])
return {
'id': media_id,
'title': title,
@@ -106,4 +144,9 @@ class CCMAIE(InfoExtractor):
'thumbnails': thumbnails,
'subtitles': subtitles,
'formats': formats,
'age_limit': age_limit,
'alt_title': informacio.get('titol_complet'),
'episode_number': int_or_none(informacio.get('capitol')),
'categories': [tematica] if tematica else None,
'series': informacio.get('programa'),
}

View File

@@ -96,7 +96,7 @@ class CDAIE(InfoExtractor):
raise ExtractorError('This video is only available for premium users.', expected=True)
need_confirm_age = False
if self._html_search_regex(r'(<form[^>]+action="/a/validatebirth")',
if self._html_search_regex(r'(<form[^>]+action="[^"]*/a/validatebirth[^"]*")',
webpage, 'birthday validate form', default=None):
webpage = self._download_age_confirm_page(
url, video_id, note='Confirming age')

View File

@@ -336,9 +336,8 @@ class InfoExtractor(object):
There must be a key "entries", which is a list, an iterable, or a PagedList
object, each element of which is a valid dictionary by this specification.
Additionally, playlists can have "id", "title", "description", "uploader",
"uploader_id", "uploader_url", "duration" attributes with the same semantics
as videos (see above).
Additionally, playlists can have "id", "title", and any other relevent
attributes with the same semantics as videos (see above).
_type "multi_video" indicates that there are multiple videos that
@@ -967,15 +966,16 @@ class InfoExtractor(object):
urls, playlist_id=playlist_id, playlist_title=playlist_title)
@staticmethod
def playlist_result(entries, playlist_id=None, playlist_title=None, playlist_description=None):
def playlist_result(entries, playlist_id=None, playlist_title=None, playlist_description=None, **kwargs):
"""Returns a playlist"""
video_info = {'_type': 'playlist',
'entries': entries}
video_info.update(kwargs)
if playlist_id:
video_info['id'] = playlist_id
if playlist_title:
video_info['title'] = playlist_title
if playlist_description:
if playlist_description is not None:
video_info['description'] = playlist_description
return video_info
@@ -1366,16 +1366,16 @@ class InfoExtractor(object):
class FormatSort:
regex = r' *((?P<reverse>\+)?(?P<field>[a-zA-Z0-9_]+)((?P<seperator>[~:])(?P<limit>.*?))?)? *$'
default = ('hidden', 'has_video', 'extractor', 'lang', 'quality',
'res', 'fps', 'codec', 'size', 'br', 'asr',
'proto', 'ext', 'has_audio', 'source', 'format_id')
default = ('hidden', 'hasvid', 'ie_pref', 'lang', 'quality',
'res', 'fps', 'codec:vp9', 'size', 'br', 'asr',
'proto', 'ext', 'has_audio', 'source', 'format_id') # These must not be aliases
settings = {
'vcodec': {'type': 'ordered', 'regex': True,
'order': ['vp9', '(h265|he?vc?)', '(h264|avc)', 'vp8', '(mp4v|h263)', 'theora', '', None, 'none']},
'order': ['av0?1', 'vp9', '(h265|he?vc?)', '(h264|avc)', 'vp8', '(mp4v|h263)', 'theora', '', None, 'none']},
'acodec': {'type': 'ordered', 'regex': True,
'order': ['opus', 'vorbis', 'aac', 'mp?4a?', 'mp3', 'e?a?c-?3', 'dts', '', None, 'none']},
'proto': {'type': 'ordered', 'regex': True,
'proto': {'type': 'ordered', 'regex': True, 'field': 'protocol',
'order': ['(ht|f)tps', '(ht|f)tp$', 'm3u8.+', 'm3u8', '.*dash', '', 'mms|rtsp', 'none', 'f4']},
'vext': {'type': 'ordered', 'field': 'video_ext',
'order': ('mp4', 'webm', 'flv', '', 'none'),
@@ -1384,14 +1384,14 @@ class InfoExtractor(object):
'order': ('m4a', 'aac', 'mp3', 'ogg', 'opus', 'webm', '', 'none'),
'order_free': ('opus', 'ogg', 'webm', 'm4a', 'mp3', 'aac', '', 'none')},
'hidden': {'visible': False, 'forced': True, 'type': 'extractor', 'max': -1000},
'ie_pref': {'priority': True, 'type': 'extractor'},
'ie_pref': {'priority': True, 'type': 'extractor', 'field': 'extractor_preference'},
'hasvid': {'priority': True, 'field': 'vcodec', 'type': 'boolean', 'not_in_list': ('none',)},
'hasaud': {'field': 'acodec', 'type': 'boolean', 'not_in_list': ('none',)},
'lang': {'priority': True, 'convert': 'ignore'},
'lang': {'priority': True, 'convert': 'ignore', 'field': 'language_preference'},
'quality': {'priority': True, 'convert': 'float_none'},
'filesize': {'convert': 'bytes'},
'fs_approx': {'convert': 'bytes'},
'id': {'convert': 'string'},
'fs_approx': {'convert': 'bytes', 'field': 'filesize_approx'},
'id': {'convert': 'string', 'field': 'format_id'},
'height': {'convert': 'float_none'},
'width': {'convert': 'float_none'},
'fps': {'convert': 'float_none'},
@@ -1399,7 +1399,7 @@ class InfoExtractor(object):
'vbr': {'convert': 'float_none'},
'abr': {'convert': 'float_none'},
'asr': {'convert': 'float_none'},
'source': {'convert': 'ignore'},
'source': {'convert': 'ignore', 'field': 'source_preference'},
'codec': {'type': 'combined', 'field': ('vcodec', 'acodec')},
'br': {'type': 'combined', 'field': ('tbr', 'vbr', 'abr'), 'same_limit': True},
@@ -2264,7 +2264,7 @@ class InfoExtractor(object):
})
return entries
def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, formats_dict={}, data=None, headers={}, query={}):
def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, data=None, headers={}, query={}):
res = self._download_xml_handle(
mpd_url, video_id,
note=note or 'Downloading MPD manifest',
@@ -2278,10 +2278,9 @@ class InfoExtractor(object):
mpd_base_url = base_url(urlh.geturl())
return self._parse_mpd_formats(
mpd_doc, mpd_id=mpd_id, mpd_base_url=mpd_base_url,
formats_dict=formats_dict, mpd_url=mpd_url)
mpd_doc, mpd_id, mpd_base_url, mpd_url)
def _parse_mpd_formats(self, mpd_doc, mpd_id=None, mpd_base_url='', formats_dict={}, mpd_url=None):
def _parse_mpd_formats(self, mpd_doc, mpd_id=None, mpd_base_url='', mpd_url=None):
"""
Parse formats from MPD manifest.
References:
@@ -2560,15 +2559,7 @@ class InfoExtractor(object):
else:
# Assuming direct URL to unfragmented media.
f['url'] = base_url
# According to [1, 5.3.5.2, Table 7, page 35] @id of Representation
# is not necessarily unique within a Period thus formats with
# the same `format_id` are quite possible. There are numerous examples
# of such manifests (see https://github.com/ytdl-org/youtube-dl/issues/15111,
# https://github.com/ytdl-org/youtube-dl/issues/13919)
full_info = formats_dict.get(representation_id, {}).copy()
full_info.update(f)
formats.append(full_info)
formats.append(f)
else:
self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type)
return formats

View File

@@ -12,7 +12,14 @@ from ..utils import (
)
class EggheadCourseIE(InfoExtractor):
class EggheadBaseIE(InfoExtractor):
def _call_api(self, path, video_id, resource, fatal=True):
return self._download_json(
'https://app.egghead.io/api/v1/' + path,
video_id, 'Downloading %s JSON' % resource, fatal=fatal)
class EggheadCourseIE(EggheadBaseIE):
IE_DESC = 'egghead.io course'
IE_NAME = 'egghead:course'
_VALID_URL = r'https://egghead\.io/courses/(?P<id>[^/?#&]+)'
@@ -28,10 +35,9 @@ class EggheadCourseIE(InfoExtractor):
def _real_extract(self, url):
playlist_id = self._match_id(url)
lessons = self._download_json(
'https://egghead.io/api/v1/series/%s/lessons' % playlist_id,
playlist_id, 'Downloading course lessons JSON')
series_path = 'series/' + playlist_id
lessons = self._call_api(
series_path + '/lessons', playlist_id, 'course lessons')
entries = []
for lesson in lessons:
@@ -44,9 +50,8 @@ class EggheadCourseIE(InfoExtractor):
entries.append(self.url_result(
lesson_url, ie=EggheadLessonIE.ie_key(), video_id=lesson_id))
course = self._download_json(
'https://egghead.io/api/v1/series/%s' % playlist_id,
playlist_id, 'Downloading course JSON', fatal=False) or {}
course = self._call_api(
series_path, playlist_id, 'course', False) or {}
playlist_id = course.get('id')
if playlist_id:
@@ -57,7 +62,7 @@ class EggheadCourseIE(InfoExtractor):
course.get('description'))
class EggheadLessonIE(InfoExtractor):
class EggheadLessonIE(EggheadBaseIE):
IE_DESC = 'egghead.io lesson'
IE_NAME = 'egghead:lesson'
_VALID_URL = r'https://egghead\.io/(?:api/v1/)?lessons/(?P<id>[^/?#&]+)'
@@ -74,7 +79,7 @@ class EggheadLessonIE(InfoExtractor):
'upload_date': '20161209',
'duration': 304,
'view_count': 0,
'tags': ['javascript', 'free'],
'tags': 'count:2',
},
'params': {
'skip_download': True,
@@ -88,8 +93,8 @@ class EggheadLessonIE(InfoExtractor):
def _real_extract(self, url):
display_id = self._match_id(url)
lesson = self._download_json(
'https://egghead.io/api/v1/lessons/%s' % display_id, display_id)
lesson = self._call_api(
'lessons/' + display_id, display_id, 'lesson')
lesson_id = compat_str(lesson['id'])
title = lesson['title']

View File

@@ -90,6 +90,11 @@ from .atvat import ATVAtIE
from .audimedia import AudiMediaIE
from .audioboom import AudioBoomIE
from .audiomack import AudiomackIE, AudiomackAlbumIE
from .audius import (
AudiusIE,
AudiusTrackIE,
AudiusPlaylistIE
)
from .awaan import (
AWAANIE,
AWAANVideoIE,
@@ -122,10 +127,12 @@ from .bigflix import BigflixIE
from .bild import BildIE
from .bilibili import (
BiliBiliIE,
BiliBiliSearchIE,
BiliBiliBangumiIE,
BilibiliAudioIE,
BilibiliAudioAlbumIE,
BiliBiliPlayerIE,
BilibiliChannelIE,
)
from .biobiochiletv import BioBioChileTVIE
from .bitchute import (
@@ -1301,6 +1308,7 @@ from .tv2 import (
TV2IE,
TV2ArticleIE,
KatsomoIE,
MTVUutisetArticleIE,
)
from .tv2dk import (
TV2DKIE,
@@ -1441,7 +1449,6 @@ from .vidme import (
VidmeUserIE,
VidmeUserLikesIE,
)
from .vidzi import VidziIE
from .vier import VierIE, VierVideosIE
from .viewlift import (
ViewLiftIE,
@@ -1501,6 +1508,7 @@ from .vrv import (
VRVSeriesIE,
)
from .vshare import VShareIE
from .vtm import VTMIE
from .medialaan import MedialaanIE
from .vube import VubeIE
from .vuclip import VuClipIE

View File

@@ -131,6 +131,7 @@ from .gedi import GediEmbedsIE
from .rcs import RCSEmbedsIE
from .bitchute import BitChuteIE
from .arcpublishing import ArcPublishingIE
from .medialaan import MedialaanIE
class GenericIE(InfoExtractor):
@@ -2224,6 +2225,20 @@ class GenericIE(InfoExtractor):
'duration': 1581,
},
},
{
# MyChannels SDK embed
# https://www.24kitchen.nl/populair/deskundige-dit-waarom-sommigen-gevoelig-zijn-voor-voedselallergieen
'url': 'https://www.demorgen.be/nieuws/burgemeester-rotterdam-richt-zich-in-videoboodschap-tot-relschoppers-voelt-het-goed~b0bcfd741/',
'md5': '90c0699c37006ef18e198c032d81739c',
'info_dict': {
'id': '194165',
'ext': 'mp4',
'title': 'Burgemeester Aboutaleb spreekt relschoppers toe',
'timestamp': 1611740340,
'upload_date': '20210127',
'duration': 159,
},
},
]
def report_following_redirect(self, new_url):
@@ -2463,6 +2478,9 @@ class GenericIE(InfoExtractor):
webpage = self._webpage_read_content(
full_response, url, video_id, prefix=first_bytes)
if '<title>DPG Media Privacy Gate</title>' in webpage:
webpage = self._download_webpage(url, video_id)
self.report_extraction(video_id)
# Is it an RSS feed, a SMIL file, an XSPF playlist or a MPD manifest?
@@ -2594,6 +2612,11 @@ class GenericIE(InfoExtractor):
if arc_urls:
return self.playlist_from_matches(arc_urls, video_id, video_title, ie=ArcPublishingIE.ie_key())
mychannels_urls = MedialaanIE._extract_urls(webpage)
if mychannels_urls:
return self.playlist_from_matches(
mychannels_urls, video_id, video_title, ie=MedialaanIE.ie_key())
# Look for embedded rtl.nl player
matches = re.findall(
r'<iframe[^>]+?src="((?:https?:)?//(?:(?:www|static)\.)?rtl\.nl/(?:system/videoplayer/[^"]+(?:video_)?)?embed[^"]+)"',

View File

@@ -7,6 +7,7 @@ from ..compat import compat_parse_qs
from ..utils import (
determine_ext,
ExtractorError,
get_element_by_class,
int_or_none,
lowercase_escape,
try_get,
@@ -237,7 +238,7 @@ class GoogleDriveIE(InfoExtractor):
if confirmation_webpage:
confirm = self._search_regex(
r'confirm=([^&"\']+)', confirmation_webpage,
'confirmation code', fatal=False)
'confirmation code', default=None)
if confirm:
confirmed_source_url = update_url_query(source_url, {
'confirm': confirm,
@@ -245,6 +246,11 @@ class GoogleDriveIE(InfoExtractor):
urlh = request_source_file(confirmed_source_url, 'confirmed source')
if urlh and urlh.headers.get('Content-Disposition'):
add_source_format(urlh)
else:
self.report_warning(
get_element_by_class('uc-error-subcaption', confirmation_webpage)
or get_element_by_class('uc-error-caption', confirmation_webpage)
or 'unable to extract confirmation code')
if not formats and reason:
raise ExtractorError(reason, expected=True)

View File

@@ -2,268 +2,113 @@ from __future__ import unicode_literals
import re
from .gigya import GigyaBaseIE
from ..compat import compat_str
from .common import InfoExtractor
from ..utils import (
extract_attributes,
int_or_none,
parse_duration,
try_get,
unified_timestamp,
mimetype2ext,
parse_iso8601,
)
class MedialaanIE(GigyaBaseIE):
class MedialaanIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:www\.|nieuws\.)?
(?:
(?P<site_id>vtm|q2|vtmkzoom)\.be/
(?:
video(?:/[^/]+/id/|/?\?.*?\baid=)|
(?:[^/]+/)*
)
(?:embed\.)?mychannels.video/embed/|
embed\.mychannels\.video/(?:s(?:dk|cript)/)?production/|
(?:www\.)?(?:
(?:
7sur7|
demorgen|
hln|
joe|
qmusic
)\.be|
(?:
[abe]d|
bndestem|
destentor|
gelderlander|
pzc|
tubantia|
volkskrant
)\.nl
)/video/(?:[^/]+/)*[^/?&#]+~p
)
(?P<id>[^/?#&]+)
(?P<id>\d+)
'''
_NETRC_MACHINE = 'medialaan'
_APIKEY = '3_HZ0FtkMW_gOyKlqQzW5_0FHRC7Nd5XpXJZcDdXY4pk5eES2ZWmejRW5egwVm4ug-'
_SITE_TO_APP_ID = {
'vtm': 'vtm_watch',
'q2': 'q2',
'vtmkzoom': 'vtmkzoom',
}
_TESTS = [{
# vod
'url': 'http://vtm.be/video/volledige-afleveringen/id/vtm_20170219_VM0678361_vtmwatch',
'url': 'https://www.bndestem.nl/video/de-terugkeer-van-ally-de-aap-en-wie-vertrekt-er-nog-bij-nac~p193993',
'info_dict': {
'id': 'vtm_20170219_VM0678361_vtmwatch',
'id': '193993',
'ext': 'mp4',
'title': 'Allemaal Chris afl. 6',
'description': 'md5:4be86427521e7b07e0adb0c9c554ddb2',
'timestamp': 1487533280,
'upload_date': '20170219',
'duration': 2562,
'series': 'Allemaal Chris',
'season': 'Allemaal Chris',
'season_number': 1,
'season_id': '256936078124527',
'episode': 'Allemaal Chris afl. 6',
'episode_number': 6,
'episode_id': '256936078591527',
'title': 'De terugkeer van Ally de Aap en wie vertrekt er nog bij NAC?',
'timestamp': 1611663540,
'upload_date': '20210126',
'duration': 238,
},
'params': {
'skip_download': True,
},
'skip': 'Requires account credentials',
}, {
# clip
'url': 'http://vtm.be/video?aid=168332',
'info_dict': {
'id': '168332',
'ext': 'mp4',
'title': '"Veronique liegt!"',
'description': 'md5:1385e2b743923afe54ba4adc38476155',
'timestamp': 1489002029,
'upload_date': '20170308',
'duration': 96,
},
}, {
# vod
'url': 'http://vtm.be/video/volledige-afleveringen/id/257107153551000',
'url': 'https://www.gelderlander.nl/video/kanalen/degelderlander~c320/series/snel-nieuws~s984/noodbevel-in-doetinchem-politie-stuurt-mensen-centrum-uit~p194093',
'only_matching': True,
}, {
# vod
'url': 'http://vtm.be/video?aid=163157',
'url': 'https://embed.mychannels.video/sdk/production/193993?options=TFTFF_default',
'only_matching': True,
}, {
# vod
'url': 'http://www.q2.be/video/volledige-afleveringen/id/2be_20170301_VM0684442_q2',
'url': 'https://embed.mychannels.video/script/production/193993',
'only_matching': True,
}, {
# clip
'url': 'http://vtmkzoom.be/k3-dansstudio/een-nieuw-seizoen-van-k3-dansstudio',
'url': 'https://embed.mychannels.video/production/193993',
'only_matching': True,
}, {
# http/s redirect
'url': 'https://vtmkzoom.be/video?aid=45724',
'info_dict': {
'id': '257136373657000',
'ext': 'mp4',
'title': 'K3 Dansstudio Ushuaia afl.6',
},
'params': {
'skip_download': True,
},
'skip': 'Requires account credentials',
'url': 'https://mychannels.video/embed/193993',
'only_matching': True,
}, {
# nieuws.vtm.be
'url': 'https://nieuws.vtm.be/stadion/stadion/genk-nog-moeilijk-programma',
'url': 'https://embed.mychannels.video/embed/193993',
'only_matching': True,
}]
def _real_initialize(self):
self._logged_in = False
def _login(self):
username, password = self._get_login_info()
if username is None:
self.raise_login_required()
auth_data = {
'APIKey': self._APIKEY,
'sdk': 'js_6.1',
'format': 'json',
'loginID': username,
'password': password,
}
auth_info = self._gigya_login(auth_data)
self._uid = auth_info['UID']
self._uid_signature = auth_info['UIDSignature']
self._signature_timestamp = auth_info['signatureTimestamp']
self._logged_in = True
@staticmethod
def _extract_urls(webpage):
entries = []
for element in re.findall(r'(<div[^>]+data-mychannels-type="video"[^>]*>)', webpage):
mychannels_id = extract_attributes(element).get('data-mychannels-id')
if mychannels_id:
entries.append('https://mychannels.video/embed/' + mychannels_id)
return entries
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id, site_id = mobj.group('id', 'site_id')
production_id = self._match_id(url)
production = self._download_json(
'https://embed.mychannels.video/sdk/production/' + production_id,
production_id, query={'options': 'UUUU_default'})['productions'][0]
title = production['title']
webpage = self._download_webpage(url, video_id)
config = self._parse_json(
self._search_regex(
r'videoJSConfig\s*=\s*JSON\.parse\(\'({.+?})\'\);',
webpage, 'config', default='{}'), video_id,
transform_source=lambda s: s.replace(
'\\\\', '\\').replace(r'\"', '"').replace(r"\'", "'"))
vod_id = config.get('vodId') or self._search_regex(
(r'\\"vodId\\"\s*:\s*\\"(.+?)\\"',
r'"vodId"\s*:\s*"(.+?)"',
r'<[^>]+id=["\']vod-(\d+)'),
webpage, 'video_id', default=None)
# clip, no authentication required
if not vod_id:
player = self._parse_json(
self._search_regex(
r'vmmaplayer\(({.+?})\);', webpage, 'vmma player',
default=''),
video_id, transform_source=lambda s: '[%s]' % s, fatal=False)
if player:
video = player[-1]
if video['videoUrl'] in ('http', 'https'):
return self.url_result(video['url'], MedialaanIE.ie_key())
info = {
'id': video_id,
'url': video['videoUrl'],
'title': video['title'],
'thumbnail': video.get('imageUrl'),
'timestamp': int_or_none(video.get('createdDate')),
'duration': int_or_none(video.get('duration')),
}
formats = []
for source in (production.get('sources') or []):
src = source.get('src')
if not src:
continue
ext = mimetype2ext(source.get('type'))
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
src, production_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
else:
info = self._parse_html5_media_entries(
url, webpage, video_id, m3u8_id='hls')[0]
info.update({
'id': video_id,
'title': self._html_search_meta('description', webpage),
'duration': parse_duration(self._html_search_meta('duration', webpage)),
formats.append({
'ext': ext,
'url': src,
})
# vod, authentication required
else:
if not self._logged_in:
self._login()
self._sort_formats(formats)
settings = self._parse_json(
self._search_regex(
r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);',
webpage, 'drupal settings', default='{}'),
video_id)
def get(container, item):
return try_get(
settings, lambda x: x[container][item],
compat_str) or self._search_regex(
r'"%s"\s*:\s*"([^"]+)' % item, webpage, item,
default=None)
app_id = get('vod', 'app_id') or self._SITE_TO_APP_ID.get(site_id, 'vtm_watch')
sso = get('vod', 'gigyaDatabase') or 'vtm-sso'
data = self._download_json(
'http://vod.medialaan.io/api/1.0/item/%s/video' % vod_id,
video_id, query={
'app_id': app_id,
'user_network': sso,
'UID': self._uid,
'UIDSignature': self._uid_signature,
'signatureTimestamp': self._signature_timestamp,
})
formats = self._extract_m3u8_formats(
data['response']['uri'], video_id, entry_protocol='m3u8_native',
ext='mp4', m3u8_id='hls')
self._sort_formats(formats)
info = {
'id': vod_id,
'formats': formats,
}
api_key = get('vod', 'apiKey')
channel = get('medialaanGigya', 'channel')
if api_key:
videos = self._download_json(
'http://vod.medialaan.io/vod/v2/videos', video_id, fatal=False,
query={
'channels': channel,
'ids': vod_id,
'limit': 1,
'apikey': api_key,
})
if videos:
video = try_get(
videos, lambda x: x['response']['videos'][0], dict)
if video:
def get(container, item, expected_type=None):
return try_get(
video, lambda x: x[container][item], expected_type)
def get_string(container, item):
return get(container, item, compat_str)
info.update({
'series': get_string('program', 'title'),
'season': get_string('season', 'title'),
'season_number': int_or_none(get('season', 'number')),
'season_id': get_string('season', 'id'),
'episode': get_string('episode', 'title'),
'episode_number': int_or_none(get('episode', 'number')),
'episode_id': get_string('episode', 'id'),
'duration': int_or_none(
video.get('duration')) or int_or_none(
video.get('durationMillis'), scale=1000),
'title': get_string('episode', 'title'),
'description': get_string('episode', 'text'),
'timestamp': unified_timestamp(get_string(
'publication', 'begin')),
})
if not info.get('title'):
info['title'] = try_get(
config, lambda x: x['videoConfig']['title'],
compat_str) or self._html_search_regex(
r'\\"title\\"\s*:\s*\\"(.+?)\\"', webpage, 'title',
default=None) or self._og_search_title(webpage)
if not info.get('description'):
info['description'] = self._html_search_regex(
r'<div[^>]+class="field-item\s+even">\s*<p>(.+?)</p>',
webpage, 'description', default=None)
return info
return {
'id': production_id,
'title': title,
'formats': formats,
'thumbnail': production.get('posterUrl'),
'timestamp': parse_iso8601(production.get('publicationDate'), ' '),
'duration': int_or_none(production.get('duration')) or None,
}

View File

@@ -22,11 +22,15 @@ from ..utils import (
orderedSet,
remove_quotes,
str_to_int,
update_url_query,
urlencode_postdata,
url_or_none,
)
class PornHubBaseIE(InfoExtractor):
_NETRC_MACHINE = 'pornhub'
def _download_webpage_handle(self, *args, **kwargs):
def dl(*args, **kwargs):
return super(PornHubBaseIE, self)._download_webpage_handle(*args, **kwargs)
@@ -52,6 +56,66 @@ class PornHubBaseIE(InfoExtractor):
return webpage, urlh
def _real_initialize(self):
self._logged_in = False
def _login(self, host):
if self._logged_in:
return
site = host.split('.')[0]
# Both sites pornhub and pornhubpremium have separate accounts
# so there should be an option to provide credentials for both.
# At the same time some videos are available under the same video id
# on both sites so that we have to identify them as the same video.
# For that purpose we have to keep both in the same extractor
# but under different netrc machines.
username, password = self._get_login_info(netrc_machine=site)
if username is None:
return
login_url = 'https://www.%s/%slogin' % (host, 'premium/' if 'premium' in host else '')
login_page = self._download_webpage(
login_url, None, 'Downloading %s login page' % site)
def is_logged(webpage):
return any(re.search(p, webpage) for p in (
r'class=["\']signOut',
r'>Sign\s+[Oo]ut\s*<'))
if is_logged(login_page):
self._logged_in = True
return
login_form = self._hidden_inputs(login_page)
login_form.update({
'username': username,
'password': password,
})
response = self._download_json(
'https://www.%s/front/authenticate' % host, None,
'Logging in to %s' % site,
data=urlencode_postdata(login_form),
headers={
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
'Referer': login_url,
'X-Requested-With': 'XMLHttpRequest',
})
if response.get('success') == '1':
self._logged_in = True
return
message = response.get('message')
if message is not None:
raise ExtractorError(
'Unable to login: %s' % message, expected=True)
raise ExtractorError('Unable to log in')
class PornHubIE(PornHubBaseIE):
IE_DESC = 'PornHub and Thumbzilla'
@@ -163,12 +227,20 @@ class PornHubIE(PornHubBaseIE):
}, {
'url': 'https://www.pornhubpremium.com/view_video.php?viewkey=ph5e4acdae54a82',
'only_matching': True,
}, {
# Some videos are available with the same id on both premium
# and non-premium sites (e.g. this and the following test)
'url': 'https://www.pornhub.com/view_video.php?viewkey=ph5f75b0f4b18e3',
'only_matching': True,
}, {
'url': 'https://www.pornhubpremium.com/view_video.php?viewkey=ph5f75b0f4b18e3',
'only_matching': True,
}]
@staticmethod
def _extract_urls(webpage):
return re.findall(
r'<iframe[^>]+?src=["\'](?P<url>(?:https?:)?//(?:www\.)?pornhub\.(?:com|net|org)/embed/[\da-z]+)',
r'<iframe[^>]+?src=["\'](?P<url>(?:https?:)?//(?:www\.)?pornhub(?:premium)?\.(?:com|net|org)/embed/[\da-z]+)',
webpage)
def _extract_count(self, pattern, webpage, name):
@@ -180,12 +252,7 @@ class PornHubIE(PornHubBaseIE):
host = mobj.group('host') or 'pornhub.com'
video_id = mobj.group('id')
if 'premium' in host:
if not self._downloader.params.get('cookiefile'):
raise ExtractorError(
'PornHub Premium requires authentication.'
' You may want to use --cookies.',
expected=True)
self._login(host)
self._set_cookie(host, 'age_verified', '1')
@@ -405,6 +472,10 @@ class PornHubIE(PornHubBaseIE):
class PornHubPlaylistBaseIE(PornHubBaseIE):
def _extract_page(self, url):
return int_or_none(self._search_regex(
r'\bpage=(\d+)', url, 'page', default=None))
def _extract_entries(self, webpage, host):
# Only process container div with main playlist content skipping
# drop-down menu that uses similar pattern for videos (see
@@ -422,26 +493,6 @@ class PornHubPlaylistBaseIE(PornHubBaseIE):
container))
]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
host = mobj.group('host')
playlist_id = mobj.group('id')
webpage = self._download_webpage(url, playlist_id)
entries = self._extract_entries(webpage, host)
playlist = self._parse_json(
self._search_regex(
r'(?:playlistObject|PLAYLIST_VIEW)\s*=\s*({.+?});', webpage,
'playlist', default='{}'),
playlist_id, fatal=False)
title = playlist.get('title') or self._search_regex(
r'>Videos\s+in\s+(.+?)\s+[Pp]laylist<', webpage, 'title', fatal=False)
return self.playlist_result(
entries, playlist_id, title, playlist.get('description'))
class PornHubUserIE(PornHubPlaylistBaseIE):
_VALID_URL = r'(?P<url>https?://(?:[^/]+\.)?(?P<host>pornhub(?:premium)?\.(?:com|net|org))/(?:(?:user|channel)s|model|pornstar)/(?P<id>[^/?#&]+))(?:[?#&]|/(?!videos)|$)'
@@ -463,14 +514,27 @@ class PornHubUserIE(PornHubPlaylistBaseIE):
}, {
'url': 'https://www.pornhub.com/model/zoe_ph?abc=1',
'only_matching': True,
}, {
# Unavailable via /videos page, but available with direct pagination
# on pornstar page (see [1]), requires premium
# 1. https://github.com/ytdl-org/youtube-dl/issues/27853
'url': 'https://www.pornhubpremium.com/pornstar/sienna-west',
'only_matching': True,
}, {
# Same as before, multi page
'url': 'https://www.pornhubpremium.com/pornstar/lily-labeau',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
user_id = mobj.group('id')
videos_url = '%s/videos' % mobj.group('url')
page = self._extract_page(url)
if page:
videos_url = update_url_query(videos_url, {'page': page})
return self.url_result(
'%s/videos' % mobj.group('url'), ie=PornHubPagedVideoListIE.ie_key(),
video_id=user_id)
videos_url, ie=PornHubPagedVideoListIE.ie_key(), video_id=user_id)
class PornHubPagedPlaylistBaseIE(PornHubPlaylistBaseIE):
@@ -483,32 +547,55 @@ class PornHubPagedPlaylistBaseIE(PornHubPlaylistBaseIE):
<button[^>]+\bid=["\']moreDataBtn
''', webpage) is not None
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
host = mobj.group('host')
item_id = mobj.group('id')
def _entries(self, url, host, item_id):
page = self._extract_page(url)
page = int_or_none(self._search_regex(
r'\bpage=(\d+)', url, 'page', default=None))
VIDEOS = '/videos'
entries = []
for page_num in (page, ) if page is not None else itertools.count(1):
def download_page(base_url, num, fallback=False):
note = 'Downloading page %d%s' % (num, ' (switch to fallback)' if fallback else '')
return self._download_webpage(
base_url, item_id, note, query={'page': num})
def is_404(e):
return isinstance(e.cause, compat_HTTPError) and e.cause.code == 404
base_url = url
has_page = page is not None
first_page = page if has_page else 1
for page_num in (first_page, ) if has_page else itertools.count(first_page):
try:
webpage = self._download_webpage(
url, item_id, 'Downloading page %d' % page_num,
query={'page': page_num})
try:
webpage = download_page(base_url, page_num)
except ExtractorError as e:
# Some sources may not be available via /videos page,
# trying to fallback to main page pagination (see [1])
# 1. https://github.com/ytdl-org/youtube-dl/issues/27853
if is_404(e) and page_num == first_page and VIDEOS in base_url:
base_url = base_url.replace(VIDEOS, '')
webpage = download_page(base_url, page_num, fallback=True)
else:
raise
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 404:
if is_404(e) and page_num != first_page:
break
raise
page_entries = self._extract_entries(webpage, host)
if not page_entries:
break
entries.extend(page_entries)
for e in page_entries:
yield e
if not self._has_more(webpage):
break
return self.playlist_result(orderedSet(entries), item_id)
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
host = mobj.group('host')
item_id = mobj.group('id')
self._login(host)
return self.playlist_result(self._entries(url, host, item_id), item_id)
class PornHubPagedVideoListIE(PornHubPagedPlaylistBaseIE):

View File

@@ -255,8 +255,10 @@ class SVTPlayIE(SVTPlayBaseIE):
svt_id = self._search_regex(
(r'<video[^>]+data-video-id=["\']([\da-zA-Z-]+)',
r'["\']videoSvtId["\']\s*:\s*["\']([\da-zA-Z-]+)',
r'["\']videoSvtId\\?["\']\s*:\s*\\?["\']([\da-zA-Z-]+)',
r'"content"\s*:\s*{.*?"id"\s*:\s*"([\da-zA-Z-]+)"',
r'["\']svtId["\']\s*:\s*["\']([\da-zA-Z-]+)'),
r'["\']svtId["\']\s*:\s*["\']([\da-zA-Z-]+)',
r'["\']svtId\\?["\']\s*:\s*\\?["\']([\da-zA-Z-]+)'),
webpage, 'video id')
info_dict = self._extract_by_video_id(svt_id, webpage)

View File

@@ -20,7 +20,7 @@ from ..utils import (
class TV2IE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?tv2\.no/v/(?P<id>\d+)'
_TEST = {
_TESTS = [{
'url': 'http://www.tv2.no/v/916509/',
'info_dict': {
'id': '916509',
@@ -33,7 +33,7 @@ class TV2IE(InfoExtractor):
'view_count': int,
'categories': list,
},
}
}]
_API_DOMAIN = 'sumo.tv2.no'
_PROTOCOLS = ('HDS', 'HLS', 'DASH')
_GEO_COUNTRIES = ['NO']
@@ -42,6 +42,12 @@ class TV2IE(InfoExtractor):
video_id = self._match_id(url)
api_base = 'http://%s/api/web/asset/%s' % (self._API_DOMAIN, video_id)
asset = self._download_json(
api_base + '.json', video_id,
'Downloading metadata JSON')['asset']
title = asset.get('subtitle') or asset['title']
is_live = asset.get('live') is True
formats = []
format_urls = []
for protocol in self._PROTOCOLS:
@@ -81,7 +87,8 @@ class TV2IE(InfoExtractor):
elif ext == 'm3u8':
if not data.get('drmProtected'):
formats.extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', entry_protocol='m3u8_native',
video_url, video_id, 'mp4',
'm3u8' if is_live else 'm3u8_native',
m3u8_id=format_id, fatal=False))
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(
@@ -99,11 +106,6 @@ class TV2IE(InfoExtractor):
raise ExtractorError('This video is DRM protected.', expected=True)
self._sort_formats(formats)
asset = self._download_json(
api_base + '.json', video_id,
'Downloading metadata JSON')['asset']
title = asset['title']
thumbnails = [{
'id': thumbnail.get('@type'),
'url': thumbnail.get('url'),
@@ -112,7 +114,7 @@ class TV2IE(InfoExtractor):
return {
'id': video_id,
'url': video_url,
'title': title,
'title': self._live_title(title) if is_live else title,
'description': strip_or_none(asset.get('description')),
'thumbnails': thumbnails,
'timestamp': parse_iso8601(asset.get('createTime')),
@@ -120,6 +122,7 @@ class TV2IE(InfoExtractor):
'view_count': int_or_none(asset.get('views')),
'categories': asset.get('keywords', '').split(','),
'formats': formats,
'is_live': is_live,
}
@@ -168,13 +171,13 @@ class TV2ArticleIE(InfoExtractor):
class KatsomoIE(TV2IE):
_VALID_URL = r'https?://(?:www\.)?(?:katsomo|mtv)\.fi/(?:#!/)?(?:[^/]+/[0-9a-z-]+-\d+/[0-9a-z-]+-|[^/]+/\d+/[^/]+/)(?P<id>\d+)'
_TEST = {
_VALID_URL = r'https?://(?:www\.)?(?:katsomo|mtv(uutiset)?)\.fi/(?:sarja/[0-9a-z-]+-\d+/[0-9a-z-]+-|(?:#!/)?jakso/(?:\d+/[^/]+/)?|video/prog)(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.mtv.fi/sarja/mtv-uutiset-live-33001002003/lahden-pelicans-teki-kovan-ratkaisun-ville-nieminen-pihalle-1181321',
'info_dict': {
'id': '1181321',
'ext': 'mp4',
'title': 'MTV Uutiset Live',
'title': 'Lahden Pelicans teki kovan ratkaisun Ville Nieminen pihalle',
'description': 'Päätöksen teki Pelicansin hallitus.',
'timestamp': 1575116484,
'upload_date': '20191130',
@@ -186,7 +189,60 @@ class KatsomoIE(TV2IE):
# m3u8 download
'skip_download': True,
},
}
}, {
'url': 'http://www.katsomo.fi/#!/jakso/33001005/studio55-fi/658521/jukka-kuoppamaki-tekee-yha-lauluja-vaikka-lentokoneessa',
'only_matching': True,
}, {
'url': 'https://www.mtvuutiset.fi/video/prog1311159',
'only_matching': True,
}, {
'url': 'https://www.katsomo.fi/#!/jakso/1311159',
'only_matching': True,
}]
_API_DOMAIN = 'api.katsomo.fi'
_PROTOCOLS = ('HLS', 'MPD')
_GEO_COUNTRIES = ['FI']
class MTVUutisetArticleIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)mtvuutiset\.fi/artikkeli/[^/]+/(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.mtvuutiset.fi/artikkeli/tallaisia-vaurioita-viking-amorellassa-on-useamman-osaston-alla-vetta/7931384',
'info_dict': {
'id': '1311159',
'ext': 'mp4',
'title': 'Viking Amorellan matkustajien evakuointi on alkanut tältä operaatio näyttää laivalla',
'description': 'Viking Amorellan matkustajien evakuointi on alkanut tältä operaatio näyttää laivalla',
'timestamp': 1600608966,
'upload_date': '20200920',
'duration': 153.7886666,
'view_count': int,
'categories': list,
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
# multiple Youtube embeds
'url': 'https://www.mtvuutiset.fi/artikkeli/50-vuotta-subarun-vastaiskua/6070962',
'only_matching': True,
}]
def _real_extract(self, url):
article_id = self._match_id(url)
article = self._download_json(
'http://api.mtvuutiset.fi/mtvuutiset/api/json/' + article_id,
article_id)
def entries():
for video in (article.get('videos') or []):
video_type = video.get('videotype')
video_url = video.get('url')
if not (video_url and video_type in ('katsomo', 'youtube')):
continue
yield self.url_result(
video_url, video_type.capitalize(), video.get('video_id'))
return self.playlist_result(
entries(), article_id, article.get('title'), article.get('description'))

View File

@@ -17,7 +17,7 @@ class TV4IE(InfoExtractor):
tv4\.se/(?:[^/]+)/klipp/(?:.*)-|
tv4play\.se/
(?:
(?:program|barn)/(?:[^/]+/|(?:[^\?]+)\?video_id=)|
(?:program|barn)/(?:(?:[^/]+/){1,2}|(?:[^\?]+)\?video_id=)|
iframe/video/|
film/|
sport/|
@@ -65,6 +65,10 @@ class TV4IE(InfoExtractor):
{
'url': 'http://www.tv4play.se/program/farang/3922081',
'only_matching': True,
},
{
'url': 'https://www.tv4play.se/program/nyheterna/avsnitt/13315940',
'only_matching': True,
}
]

View File

@@ -4,7 +4,13 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import int_or_none
from ..utils import (
int_or_none,
parse_iso8601,
str_or_none,
strip_or_none,
try_get,
)
class VidioIE(InfoExtractor):
@@ -21,57 +27,63 @@ class VidioIE(InfoExtractor):
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 149,
'like_count': int,
'uploader': 'TWELVE Pic',
'timestamp': 1444902800,
'upload_date': '20151015',
'uploader_id': 'twelvepictures',
'channel': 'Cover Music Video',
'channel_id': '280236',
'view_count': int,
'dislike_count': int,
'comment_count': int,
'tags': 'count:4',
},
}, {
'url': 'https://www.vidio.com/watch/77949-south-korea-test-fires-missile-that-can-strike-all-of-the-north',
'only_matching': True,
}]
def _real_initialize(self):
self._api_key = self._download_json(
'https://www.vidio.com/auth', None, data=b'')['api_key']
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id, display_id = mobj.group('id', 'display_id')
video_id, display_id = re.match(self._VALID_URL, url).groups()
data = self._download_json(
'https://api.vidio.com/videos/' + video_id, display_id, headers={
'Content-Type': 'application/vnd.api+json',
'X-API-KEY': self._api_key,
})
video = data['videos'][0]
title = video['title'].strip()
webpage = self._download_webpage(url, display_id)
title = self._og_search_title(webpage)
m3u8_url, duration, thumbnail = [None] * 3
clips = self._parse_json(
self._html_search_regex(
r'data-json-clips\s*=\s*(["\'])(?P<data>\[.+?\])\1',
webpage, 'video data', default='[]', group='data'),
display_id, fatal=False)
if clips:
clip = clips[0]
m3u8_url = clip.get('sources', [{}])[0].get('file')
duration = clip.get('clip_duration')
thumbnail = clip.get('image')
m3u8_url = m3u8_url or self._search_regex(
r'data(?:-vjs)?-clip-hls-url=(["\'])(?P<url>(?:(?!\1).)+)\1',
webpage, 'hls url', group='url')
formats = self._extract_m3u8_formats(
m3u8_url, display_id, 'mp4', entry_protocol='m3u8_native')
data['clips'][0]['hls_url'], display_id, 'mp4', 'm3u8_native')
self._sort_formats(formats)
duration = int_or_none(duration or self._search_regex(
r'data-video-duration=(["\'])(?P<duration>\d+)\1', webpage,
'duration', fatal=False, group='duration'))
thumbnail = thumbnail or self._og_search_thumbnail(webpage)
like_count = int_or_none(self._search_regex(
(r'<span[^>]+data-comment-vote-count=["\'](\d+)',
r'<span[^>]+class=["\'].*?\blike(?:__|-)count\b.*?["\'][^>]*>\s*(\d+)'),
webpage, 'like count', fatal=False))
get_first = lambda x: try_get(data, lambda y: y[x + 's'][0], dict) or {}
channel = get_first('channel')
user = get_first('user')
username = user.get('username')
get_count = lambda x: int_or_none(video.get('total_' + x))
return {
'id': video_id,
'display_id': display_id,
'title': title,
'description': self._og_search_description(webpage),
'thumbnail': thumbnail,
'duration': duration,
'like_count': like_count,
'description': strip_or_none(video.get('description')),
'thumbnail': video.get('image_url_medium'),
'duration': int_or_none(video.get('duration')),
'like_count': get_count('likes'),
'formats': formats,
'uploader': user.get('name'),
'timestamp': parse_iso8601(video.get('created_at')),
'uploader_id': username,
'uploader_url': 'https://www.vidio.com/@' + username if username else None,
'channel': channel.get('name'),
'channel_id': str_or_none(channel.get('id')),
'view_count': get_count('view_count'),
'dislike_count': get_count('dislikes'),
'comment_count': get_count('comments'),
'tags': video.get('tag_list'),
}

View File

@@ -125,7 +125,7 @@ class VLiveIE(VLiveBaseIE):
headers={'Referer': 'https://www.vlive.tv/'}, query=query)
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
self.raise_login_required(json.loads(e.cause.read().decode())['message'])
self.raise_login_required(json.loads(e.cause.read().decode('utf-8'))['message'])
raise
def _real_extract(self, url):

View File

@@ -0,0 +1,62 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_iso8601,
try_get,
)
class VTMIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?vtm\.be/([^/?&#]+)~v(?P<id>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})'
_TEST = {
'url': 'https://vtm.be/gast-vernielt-genkse-hotelkamer~ve7534523-279f-4b4d-a5c9-a33ffdbe23e1',
'md5': '37dca85fbc3a33f2de28ceb834b071f8',
'info_dict': {
'id': '192445',
'ext': 'mp4',
'title': 'Gast vernielt Genkse hotelkamer',
'timestamp': 1611060180,
'upload_date': '20210119',
'duration': 74,
# TODO: fix url _type result processing
# 'series': 'Op Interventie',
}
}
def _real_extract(self, url):
uuid = self._match_id(url)
video = self._download_json(
'https://omc4vm23offuhaxx6hekxtzspi.appsync-api.eu-west-1.amazonaws.com/graphql',
uuid, query={
'query': '''{
getComponent(type: Video, uuid: "%s") {
... on Video {
description
duration
myChannelsVideo
program {
title
}
publishedAt
title
}
}
}''' % uuid,
}, headers={
'x-api-key': 'da2-lz2cab4tfnah3mve6wiye4n77e',
})['data']['getComponent']
return {
'_type': 'url',
'id': uuid,
'title': video.get('title'),
'url': 'http://mychannels.video/embed/%d' % video['myChannelsVideo'],
'description': video.get('description'),
'timestamp': parse_iso8601(video.get('publishedAt')),
'duration': int_or_none(video.get('duration')),
'series': try_get(video, lambda x: x['program']['title']),
'ie_key': 'Medialaan',
}

View File

@@ -4,6 +4,7 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from .youtube import YoutubeIE
from ..utils import (
ExtractorError,
int_or_none,
@@ -47,6 +48,22 @@ class VVVVIDIE(InfoExtractor):
'params': {
'skip_download': True,
},
}, {
# video_type == 'video/youtube'
'url': 'https://www.vvvvid.it/show/404/one-punch-man/406/486683/trailer',
'md5': '33e0edfba720ad73a8782157fdebc648',
'info_dict': {
'id': 'RzmFKUDOUgw',
'ext': 'mp4',
'title': 'Trailer',
'upload_date': '20150906',
'description': 'md5:a5e802558d35247fee285875328c0b80',
'uploader_id': 'BandaiVisual',
'uploader': 'BANDAI NAMCO Arts Channel',
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://www.vvvvid.it/show/434/perche-dovrei-guardarlo-di-dario-moccia/437/489048',
'only_matching': True
@@ -154,12 +171,13 @@ class VVVVIDIE(InfoExtractor):
if season_number:
info['season_number'] = int(season_number)
for quality in ('_sd', ''):
video_type = video_data.get('video_type')
is_youtube = False
for quality in ('', '_sd'):
embed_code = video_data.get('embed_info' + quality)
if not embed_code:
continue
embed_code = ds(embed_code)
video_type = video_data.get('video_type')
if video_type in ('video/rcs', 'video/kenc'):
if video_type == 'video/kenc':
kenc = self._download_json(
@@ -172,19 +190,28 @@ class VVVVIDIE(InfoExtractor):
if kenc_message:
embed_code += '?' + ds(kenc_message)
formats.extend(self._extract_akamai_formats(embed_code, video_id))
elif video_type == 'video/youtube':
info.update({
'_type': 'url_transparent',
'ie_key': YoutubeIE.ie_key(),
'url': embed_code,
})
is_youtube = True
break
else:
formats.extend(self._extract_wowza_formats(
'http://sb.top-ix.org/videomg/_definst_/mp4:%s/playlist.m3u8' % embed_code, video_id))
metadata_from_url(embed_code)
self._sort_formats(formats)
if not is_youtube:
self._sort_formats(formats)
info['formats'] = formats
metadata_from_url(video_data.get('thumbnail'))
info.update(self._extract_common_video_info(video_data))
info.update({
'id': video_id,
'title': title,
'formats': formats,
'duration': int_or_none(video_data.get('length')),
'series': video_data.get('show_title'),
'season_id': season_id,

File diff suppressed because it is too large Load Diff

View File

@@ -87,11 +87,16 @@ class ZypeIE(InfoExtractor):
r'(["\'])(?P<url>(?:(?!\1).)+\.m3u8(?:(?!\1).)*)\1',
body, 'm3u8 url', group='url', default=None)
if not m3u8_url:
source = self._parse_json(self._search_regex(
r'(?s)sources\s*:\s*\[\s*({.+?})\s*\]', body,
'source'), video_id, js_to_json)
if source.get('integration') == 'verizon-media':
m3u8_url = 'https://content.uplynk.com/%s.m3u8' % source['id']
source = self._search_regex(
r'(?s)sources\s*:\s*\[\s*({.+?})\s*\]', body, 'source')
def get_attr(key):
return self._search_regex(
r'\b%s\s*:\s*([\'"])(?P<val>(?:(?!\1).)+)\1' % key,
source, key, group='val')
if get_attr('integration') == 'verizon-media':
m3u8_url = 'https://content.uplynk.com/%s.m3u8' % get_attr('id')
formats = self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls')
text_tracks = self._search_regex(

View File

@@ -16,7 +16,9 @@ from .compat import (
from .utils import (
expand_path,
get_executable_path,
OUTTMPL_TYPES,
preferredencoding,
REMUX_EXTENSIONS,
write_string,
)
from .version import __version__
@@ -828,24 +830,28 @@ def parseOpts(overrideArguments=None):
metavar='TYPE:PATH', dest='paths', default={}, type='str',
action='callback', callback=_dict_from_multiple_values_options_callback,
callback_kwargs={
'allowed_keys': 'home|temp|config|description|annotation|subtitle|infojson|thumbnail',
'allowed_keys': 'home|temp|%s' % '|'.join(OUTTMPL_TYPES.keys()),
'process': lambda x: x.strip()},
help=(
'The paths where the files should be downloaded. '
'Specify the type of file and the path separated by a colon ":" '
'(supported: description|annotation|subtitle|infojson|thumbnail). '
'Specify the type of file and the path separated by a colon ":". '
'All the same types as --output are supported. '
'Additionally, you can also provide "home" and "temp" paths. '
'All intermediary files are first downloaded to the temp path and '
'then the final files are moved over to the home path after download is finished. '
'Note that this option is ignored if --output is an absolute path'))
'This option is ignored if --output is an absolute path'))
filesystem.add_option(
'-o', '--output',
dest='outtmpl', metavar='TEMPLATE',
metavar='[TYPE:]TEMPLATE', dest='outtmpl', default={}, type='str',
action='callback', callback=_dict_from_multiple_values_options_callback,
callback_kwargs={
'allowed_keys': '|'.join(OUTTMPL_TYPES.keys()),
'default_key': 'default', 'process': lambda x: x.strip()},
help='Output filename template, see "OUTPUT TEMPLATE" for details')
filesystem.add_option(
'--output-na-placeholder',
dest='outtmpl_na_placeholder', metavar='PLACEHOLDER', default='NA',
help=('Placeholder value for unavailable meta fields in output filename template (default is "%default")'))
dest='outtmpl_na_placeholder', metavar='TEXT', default='NA',
help=('Placeholder value for unavailable meta fields in output filename template (default: "%default")'))
filesystem.add_option(
'--autonumber-size',
dest='autonumber_size', metavar='NUMBER', type=int,
@@ -889,11 +895,13 @@ def parseOpts(overrideArguments=None):
filesystem.add_option(
'-c', '--continue',
action='store_true', dest='continue_dl', default=True,
help='Resume partially downloaded files (default)')
help='Resume partially downloaded files/fragments (default)')
filesystem.add_option(
'--no-continue',
action='store_false', dest='continue_dl',
help='Restart download of partially downloaded files from beginning')
help=(
'Do not resume partially downloaded fragments. '
'If the file is unfragmented, restart download of the entire file'))
filesystem.add_option(
'--part',
action='store_false', dest='nopart', default=False,
@@ -921,7 +929,7 @@ def parseOpts(overrideArguments=None):
filesystem.add_option(
'--write-info-json',
action='store_true', dest='writeinfojson', default=False,
help='Write video metadata to a .info.json file')
help='Write video metadata to a .info.json file (this may contain personal information)')
filesystem.add_option(
'--no-write-info-json',
action='store_false', dest='writeinfojson',
@@ -934,6 +942,22 @@ def parseOpts(overrideArguments=None):
'--no-write-annotations',
action='store_false', dest='writeannotations',
help='Do not write video annotations (default)')
filesystem.add_option(
'--write-playlist-metafiles',
action='store_true', dest='allow_playlist_files', default=True,
help=(
'Write playlist metadata in addition to the video metadata '
'when using --write-info-json, --write-description etc. (default)'))
filesystem.add_option(
'--no-write-playlist-metafiles',
action='store_false', dest='allow_playlist_files',
help=(
'Do not write playlist metadata when using '
'--write-info-json, --write-description etc.'))
filesystem.add_option(
'--get-comments',
action='store_true', dest='getcomments', default=False,
help='Retrieve video comments to be placed in the .info.json file')
filesystem.add_option(
'--load-info-json', '--load-info',
dest='load_info_filename', metavar='FILE',
@@ -1001,24 +1025,28 @@ def parseOpts(overrideArguments=None):
postproc.add_option(
'-x', '--extract-audio',
action='store_true', dest='extractaudio', default=False,
help='Convert video files to audio-only files (requires ffmpeg/avconv and ffprobe/avprobe)')
help='Convert video files to audio-only files (requires ffmpeg and ffprobe)')
postproc.add_option(
'--audio-format', metavar='FORMAT', dest='audioformat', default='best',
help='Specify audio format: "best", "aac", "flac", "mp3", "m4a", "opus", "vorbis", or "wav"; "%default" by default; No effect without -x')
postproc.add_option(
'--audio-quality', metavar='QUALITY',
dest='audioquality', default='5',
help='Specify ffmpeg/avconv audio quality, insert a value between 0 (better) and 9 (worse) for VBR or a specific bitrate like 128K (default %default)')
help='Specify ffmpeg audio quality, insert a value between 0 (better) and 9 (worse) for VBR or a specific bitrate like 128K (default %default)')
postproc.add_option(
'--remux-video',
metavar='FORMAT', dest='remuxvideo', default=None,
help=(
'Remux the video into another container if necessary (currently supported: mp4|mkv). '
'If target container does not support the video/audio codec, remuxing will fail'))
'Remux the video into another container if necessary (currently supported: %s). '
'If target container does not support the video/audio codec, remuxing will fail. '
'You can specify multiple rules; eg. "aac>m4a/mov>mp4/mkv" will remux aac to m4a, mov to mp4 '
'and anything else to mkv.' % '|'.join(REMUX_EXTENSIONS)))
postproc.add_option(
'--recode-video',
metavar='FORMAT', dest='recodevideo', default=None,
help='Re-encode the video into another format if re-encoding is necessary (currently supported: mp4|flv|ogg|webm|mkv|avi)')
help=(
'Re-encode the video into another format if re-encoding is necessary. '
'The supported formats are the same as --remux-video'))
postproc.add_option(
'--postprocessor-args', '--ppa',
metavar='NAME:ARGS', dest='postprocessor_args', default={}, type='str',
@@ -1030,7 +1058,7 @@ def parseOpts(overrideArguments=None):
'to give the argument to the specified postprocessor/executable. Supported postprocessors are: '
'SponSkrub, ExtractAudio, VideoRemuxer, VideoConvertor, EmbedSubtitle, Metadata, Merger, '
'FixupStretched, FixupM4a, FixupM3u8, SubtitlesConvertor and EmbedThumbnail. '
'The supported executables are: SponSkrub, FFmpeg, FFprobe, avconf, avprobe and AtomicParsley. '
'The supported executables are: SponSkrub, FFmpeg, FFprobe, and AtomicParsley. '
'You can use this option multiple times to give different arguments to different postprocessors. '
'You can also specify "PP+EXE:ARGS" to give the arguments to the specified executable '
'only when being used by the specified postprocessor. '
@@ -1078,14 +1106,20 @@ def parseOpts(overrideArguments=None):
postproc.add_option(
'--metadata-from-title',
metavar='FORMAT', dest='metafromtitle',
help=optparse.SUPPRESS_HELP)
postproc.add_option(
'--parse-metadata',
metavar='FIELD:FORMAT', dest='metafromfield', action='append',
help=(
'Parse additional metadata like song title / artist from the video title. '
'The format syntax is the same as --output. Regular expression with '
'named capture groups may also be used. '
'The parsed parameters replace existing values. '
'Example: --metadata-from-title "%(artist)s - %(title)s" matches a title like '
'Parse additional metadata like title/artist from other fields. '
'Give field name to extract data from, and format of the field seperated by a ":". '
'Either regular expression with named capture groups or a '
'similar syntax to the output template can also be used. '
'The parsed parameters replace any existing values and can be use in output template'
'This option can be used multiple times. '
'Example: --parse-metadata "title:%(artist)s - %(title)s" matches a title like '
'"Coldplay - Paradise". '
'Example (regex): --metadata-from-title "(?P<artist>.+?) - (?P<title>.+)"'))
'Example (regex): --parse-metadata "description:Artist - (?P<artist>.+?)"'))
postproc.add_option(
'--xattrs',
action='store_true', dest='xattrs', default=False,
@@ -1100,15 +1134,15 @@ def parseOpts(overrideArguments=None):
postproc.add_option(
'--prefer-avconv', '--no-prefer-ffmpeg',
action='store_false', dest='prefer_ffmpeg',
help='Prefer avconv over ffmpeg for running the postprocessors (Alias: --no-prefer-ffmpeg)')
help=optparse.SUPPRESS_HELP)
postproc.add_option(
'--prefer-ffmpeg', '--no-prefer-avconv',
action='store_true', dest='prefer_ffmpeg',
help='Prefer ffmpeg over avconv for running the postprocessors (default) (Alias: --no-prefer-avconv)')
action='store_true', dest='prefer_ffmpeg', default=True,
help=optparse.SUPPRESS_HELP)
postproc.add_option(
'--ffmpeg-location', '--avconv-location', metavar='PATH',
dest='ffmpeg_location',
help='Location of the ffmpeg/avconv binary; either the path to the binary or its containing directory (Alias: --avconv-location)')
help='Location of the ffmpeg binary; either the path to the binary or its containing directory')
postproc.add_option(
'--exec',
metavar='CMD', dest='exec_cmd',
@@ -1217,19 +1251,15 @@ def parseOpts(overrideArguments=None):
return
def read_options(path, user=False):
func = _readUserConf if user else _readOptions
current_path = os.path.join(path, 'yt-dlp.conf')
config = func(current_path, default=None)
if user:
config, current_path = config
if config is None:
current_path = os.path.join(path, 'youtube-dlc.conf')
config = func(current_path, default=None)
for package in ('yt-dlp', 'youtube-dlc'):
if user:
config, current_path = config
if config is None:
return [], None
return config, current_path
config, current_path = _readUserConf(package, default=None)
else:
current_path = os.path.join(path, '%s.conf' % package)
config = _readOptions(current_path, default=None)
if config is not None:
return config, current_path
return [], None
configs['portable'], paths['portable'] = read_options(get_executable_path())
if '--ignore-config' in configs['portable']:

View File

@@ -16,7 +16,8 @@ from .ffmpeg import (
)
from .xattrpp import XAttrMetadataPP
from .execafterdownload import ExecAfterDownloadPP
from .metadatafromtitle import MetadataFromTitlePP
from .metadatafromfield import MetadataFromFieldPP
from .metadatafromfield import MetadataFromTitlePP
from .movefilesafterdownload import MoveFilesAfterDownloadPP
from .sponskrub import SponSkrubPP
@@ -39,6 +40,7 @@ __all__ = [
'FFmpegSubtitlesConvertorPP',
'FFmpegVideoConvertorPP',
'FFmpegVideoRemuxerPP',
'MetadataFromFieldPP',
'MetadataFromTitlePP',
'MoveFilesAfterDownloadPP',
'SponSkrubPP',

View File

@@ -56,7 +56,7 @@ class PostProcessor(object):
def write_debug(self, text, prefix=True, *args, **kwargs):
tag = '[debug] ' if prefix else ''
if self.get_param('verbose', False):
if self.get_param('verbose', False) and self._downloader:
return self._downloader.to_screen('%s%s' % (tag, text), *args, **kwargs)
def get_param(self, name, default=None, *args, **kwargs):

View File

@@ -4,6 +4,15 @@ from __future__ import unicode_literals
import os
import subprocess
import struct
import re
import base64
try:
import mutagen
has_mutagen = True
except ImportError:
has_mutagen = False
from .ffmpeg import FFmpegPostProcessor
@@ -11,11 +20,12 @@ from ..utils import (
check_executable,
encodeArgument,
encodeFilename,
error_to_compat_str,
PostProcessingError,
prepend_extension,
process_communicate_or_kill,
replace_extension,
shell_quote,
process_communicate_or_kill,
)
@@ -37,7 +47,7 @@ class EmbedThumbnailPP(FFmpegPostProcessor):
self.to_screen('There aren\'t any thumbnails to embed')
return [], info
thumbnail_filename = info['thumbnails'][-1]['filename']
original_thumbnail = thumbnail_filename = info['thumbnails'][-1]['filename']
if not os.path.exists(encodeFilename(thumbnail_filename)):
self.report_warning('Skipping embedding the thumbnail because the file is missing.')
@@ -56,7 +66,7 @@ class EmbedThumbnailPP(FFmpegPostProcessor):
self.to_screen('Correcting extension to webp and escaping path for thumbnail "%s"' % thumbnail_filename)
thumbnail_webp_filename = replace_extension(thumbnail_filename, 'webp')
os.rename(encodeFilename(thumbnail_filename), encodeFilename(thumbnail_webp_filename))
thumbnail_filename = thumbnail_webp_filename
original_thumbnail = thumbnail_filename = thumbnail_webp_filename
thumbnail_ext = 'webp'
# Convert unsupported thumbnail formats to JPEG (see #25687, #25717)
@@ -68,11 +78,12 @@ class EmbedThumbnailPP(FFmpegPostProcessor):
escaped_thumbnail_jpg_filename = replace_extension(escaped_thumbnail_filename, 'jpg')
self.to_screen('Converting thumbnail "%s" to JPEG' % escaped_thumbnail_filename)
self.run_ffmpeg(escaped_thumbnail_filename, escaped_thumbnail_jpg_filename, ['-bsf:v', 'mjpeg2jpeg'])
os.remove(encodeFilename(escaped_thumbnail_filename))
thumbnail_jpg_filename = replace_extension(thumbnail_filename, 'jpg')
# Rename back to unescaped for further processing
os.rename(encodeFilename(escaped_thumbnail_filename), encodeFilename(thumbnail_filename))
os.rename(encodeFilename(escaped_thumbnail_jpg_filename), encodeFilename(thumbnail_jpg_filename))
thumbnail_filename = thumbnail_jpg_filename
thumbnail_ext = 'jpg'
success = True
if info['ext'] == 'mp3':
@@ -83,47 +94,100 @@ class EmbedThumbnailPP(FFmpegPostProcessor):
self.to_screen('Adding thumbnail to "%s"' % filename)
self.run_ffmpeg_multiple_files([filename, thumbnail_filename], temp_filename, options)
elif info['ext'] == 'mkv':
options = [
'-c', 'copy', '-map', '0', '-dn', '-attach', thumbnail_filename,
'-metadata:s:t', 'mimetype=image/jpeg', '-metadata:s:t', 'filename=cover.jpg']
elif info['ext'] in ['mkv', 'mka']:
options = ['-c', 'copy', '-map', '0', '-dn']
mimetype = 'image/%s' % ('png' if thumbnail_ext == 'png' else 'jpeg')
old_stream, new_stream = self.get_stream_number(
filename, ('tags', 'mimetype'), mimetype)
if old_stream is not None:
options.extend(['-map', '-0:%d' % old_stream])
new_stream -= 1
options.extend([
'-attach', thumbnail_filename,
'-metadata:s:%d' % new_stream, 'mimetype=%s' % mimetype,
'-metadata:s:%d' % new_stream, 'filename=cover.%s' % thumbnail_ext])
self.to_screen('Adding thumbnail to "%s"' % filename)
self.run_ffmpeg_multiple_files([filename], temp_filename, options)
self.run_ffmpeg(filename, temp_filename, options)
elif info['ext'] in ['m4a', 'mp4']:
if not check_executable('AtomicParsley', ['-v']):
raise EmbedThumbnailPPError('AtomicParsley was not found. Please install.')
elif info['ext'] in ['m4a', 'mp4', 'mov']:
try:
options = ['-c', 'copy', '-map', '0', '-dn', '-map', '1']
cmd = [encodeFilename('AtomicParsley', True),
encodeFilename(filename, True),
encodeArgument('--artwork'),
encodeFilename(thumbnail_filename, True),
encodeArgument('-o'),
encodeFilename(temp_filename, True)]
cmd += [encodeArgument(o) for o in self._configuration_args(exe='AtomicParsley')]
old_stream, new_stream = self.get_stream_number(
filename, ('disposition', 'attached_pic'), 1)
if old_stream is not None:
options.extend(['-map', '-0:%d' % old_stream])
new_stream -= 1
options.extend(['-disposition:%s' % new_stream, 'attached_pic'])
self.to_screen('Adding thumbnail to "%s"' % filename)
self.run_ffmpeg_multiple_files([filename, thumbnail_filename], temp_filename, options)
except PostProcessingError as err:
self.report_warning('unable to embed using ffprobe & ffmpeg; %s' % error_to_compat_str(err))
if not check_executable('AtomicParsley', ['-v']):
raise EmbedThumbnailPPError('AtomicParsley was not found. Please install.')
cmd = [encodeFilename('AtomicParsley', True),
encodeFilename(filename, True),
encodeArgument('--artwork'),
encodeFilename(thumbnail_filename, True),
encodeArgument('-o'),
encodeFilename(temp_filename, True)]
cmd += [encodeArgument(o) for o in self._configuration_args(exe='AtomicParsley')]
self.to_screen('Adding thumbnail to "%s"' % filename)
self.write_debug('AtomicParsley command line: %s' % shell_quote(cmd))
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = process_communicate_or_kill(p)
if p.returncode != 0:
msg = stderr.decode('utf-8', 'replace').strip()
raise EmbedThumbnailPPError(msg)
# for formats that don't support thumbnails (like 3gp) AtomicParsley
# won't create to the temporary file
if b'No changes' in stdout:
self.report_warning('The file format doesn\'t support embedding a thumbnail')
success = False
elif info['ext'] in ['ogg', 'opus']:
if not has_mutagen:
raise EmbedThumbnailPPError('module mutagen was not found. Please install using `python -m pip install mutagen`')
self.to_screen('Adding thumbnail to "%s"' % filename)
self.write_debug('AtomicParsley command line: %s' % shell_quote(cmd))
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = process_communicate_or_kill(p)
size_regex = r',\s*(?P<w>\d+)x(?P<h>\d+)\s*[,\[]'
size_result = self.run_ffmpeg(thumbnail_filename, thumbnail_filename, ['-hide_banner'])
mobj = re.search(size_regex, size_result)
width, height = int(mobj.group('w')), int(mobj.group('h'))
mimetype = ('image/%s' % ('png' if thumbnail_ext == 'png' else 'jpeg')).encode('ascii')
if p.returncode != 0:
msg = stderr.decode('utf-8', 'replace').strip()
raise EmbedThumbnailPPError(msg)
# for formats that don't support thumbnails (like 3gp) AtomicParsley
# won't create to the temporary file
if b'No changes' in stdout:
self.report_warning('The file format doesn\'t support embedding a thumbnail')
success = False
# https://xiph.org/flac/format.html#metadata_block_picture
data = bytearray()
data += struct.pack('>II', 3, len(mimetype))
data += mimetype
data += struct.pack('>IIIIII', 0, width, height, 8, 0, os.stat(thumbnail_filename).st_size) # 32 if png else 24
fin = open(thumbnail_filename, "rb")
data += fin.read()
fin.close()
temp_filename = filename
f = mutagen.File(temp_filename)
f.tags['METADATA_BLOCK_PICTURE'] = base64.b64encode(data).decode('ascii')
f.save()
else:
raise EmbedThumbnailPPError('Only mp3, mkv, m4a and mp4 are supported for thumbnail embedding for now.')
raise EmbedThumbnailPPError('Supported filetypes for thumbnail embedding are: mp3, mkv/mka, ogg/opus, m4a/mp4/mov')
if success:
if success and temp_filename != filename:
os.remove(encodeFilename(filename))
os.rename(encodeFilename(temp_filename), encodeFilename(filename))
files_to_delete = [] if self._already_have_thumbnail else [thumbnail_filename]
files_to_delete = [thumbnail_filename]
if self._already_have_thumbnail:
info['__files_to_move'][original_thumbnail] = replace_extension(
info['__thumbnail_filename'], os.path.splitext(original_thumbnail)[1][1:])
if original_thumbnail == thumbnail_filename:
files_to_delete = []
return files_to_delete, info

View File

@@ -5,6 +5,7 @@ import os
import subprocess
import time
import re
import json
from .common import AudioConversionError, PostProcessor
@@ -20,8 +21,9 @@ from ..utils import (
subtitles_filename,
dfxp2srt,
ISO639Utils,
replace_extension,
process_communicate_or_kill,
replace_extension,
traverse_dict,
)
@@ -59,7 +61,7 @@ class FFmpegPostProcessor(PostProcessor):
def check_version(self):
if not self.available:
raise FFmpegPostProcessorError('ffmpeg or avconv not found. Please install one.')
raise FFmpegPostProcessorError('ffmpeg not found. Please install')
required_version = '10-0' if self.basename == 'avconv' else '1.0'
if is_outdated_version(
@@ -102,7 +104,7 @@ class FFmpegPostProcessor(PostProcessor):
if not os.path.exists(location):
self.report_warning(
'ffmpeg-location %s does not exist! '
'Continuing without avconv/ffmpeg.' % (location))
'Continuing without ffmpeg.' % (location))
self._versions = {}
return
elif not os.path.isdir(location):
@@ -110,7 +112,7 @@ class FFmpegPostProcessor(PostProcessor):
if basename not in programs:
self.report_warning(
'Cannot identify executable %s, its basename should be one of %s. '
'Continuing without avconv/ffmpeg.' %
'Continuing without ffmpeg.' %
(location, ', '.join(programs)))
self._versions = {}
return None
@@ -163,7 +165,7 @@ class FFmpegPostProcessor(PostProcessor):
def get_audio_codec(self, path):
if not self.probe_available and not self.available:
raise PostProcessingError('ffprobe/avprobe and ffmpeg/avconv not found. Please install one.')
raise PostProcessingError('ffprobe and ffmpeg not found. Please install')
try:
if self.probe_available:
cmd = [
@@ -201,6 +203,37 @@ class FFmpegPostProcessor(PostProcessor):
return mobj.group(1)
return None
def get_metadata_object(self, path, opts=[]):
if self.probe_basename != 'ffprobe':
if self.probe_available:
self.report_warning('Only ffprobe is supported for metadata extraction')
raise PostProcessingError('ffprobe not found. Please install.')
self.check_version()
cmd = [
encodeFilename(self.probe_executable, True),
encodeArgument('-hide_banner'),
encodeArgument('-show_format'),
encodeArgument('-show_streams'),
encodeArgument('-print_format'),
encodeArgument('json'),
]
cmd += opts
cmd.append(encodeFilename(self._ffmpeg_filename_argument(path), True))
if self._downloader.params.get('verbose', False):
self._downloader.to_screen('[debug] ffprobe command line: %s' % shell_quote(cmd))
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
stdout, stderr = p.communicate()
return json.loads(stdout.decode('utf-8', 'replace'))
def get_stream_number(self, path, keys, value):
streams = self.get_metadata_object(path)['streams']
num = next(
(i for i, stream in enumerate(streams) if traverse_dict(stream, keys, casesense=False) == value),
None)
return num, len(streams)
def run_ffmpeg_multiple_files(self, input_paths, out_path, opts):
self.check_version()
@@ -227,19 +260,23 @@ class FFmpegPostProcessor(PostProcessor):
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
stdout, stderr = process_communicate_or_kill(p)
if p.returncode != 0:
stderr = stderr.decode('utf-8', 'replace')
msg = stderr.strip().split('\n')[-1]
raise FFmpegPostProcessorError(msg)
stderr = stderr.decode('utf-8', 'replace').strip()
if self._downloader.params.get('verbose', False):
self.report_error(stderr)
raise FFmpegPostProcessorError(stderr.split('\n')[-1])
self.try_utime(out_path, oldest_mtime, oldest_mtime)
return stderr.decode('utf-8', 'replace')
def run_ffmpeg(self, path, out_path, opts):
self.run_ffmpeg_multiple_files([path], out_path, opts)
return self.run_ffmpeg_multiple_files([path], out_path, opts)
def _ffmpeg_filename_argument(self, fn):
# Always use 'file:' because the filename may contain ':' (ffmpeg
# interprets that as a protocol) or can start with '-' (-- is broken in
# ffmpeg, see https://ffmpeg.org/trac/ffmpeg/ticket/2127 for details)
# Also leave '-' intact in order not to break streaming to stdout.
if fn.startswith(('http://', 'https://')):
return fn
return 'file:' + fn if fn != '-' else fn
@@ -349,21 +386,35 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
class FFmpegVideoRemuxerPP(FFmpegPostProcessor):
def __init__(self, downloader=None, preferedformat=None):
super(FFmpegVideoRemuxerPP, self).__init__(downloader)
self._preferedformat = preferedformat
self._preferedformats = preferedformat.lower().split('/')
def run(self, information):
path = information['filepath']
if information['ext'] == self._preferedformat:
self.to_screen('Not remuxing video file %s - already is in target format %s' % (path, self._preferedformat))
sourceext, targetext = information['ext'].lower(), None
for pair in self._preferedformats:
kv = pair.split('>')
if len(kv) == 1 or kv[0].strip() == sourceext:
targetext = kv[-1].strip()
break
_skip_msg = (
'could not find a mapping for %s' if not targetext
else 'already is in target format %s' if sourceext == targetext
else None)
if _skip_msg:
self.to_screen('Not remuxing media file %s; %s' % (path, _skip_msg % sourceext))
return [], information
options = ['-c', 'copy', '-map', '0', '-dn']
prefix, sep, ext = path.rpartition('.')
outpath = prefix + sep + self._preferedformat
self.to_screen('Remuxing video from %s to %s, Destination: ' % (information['ext'], self._preferedformat) + outpath)
if targetext in ['mp4', 'm4a', 'mov']:
options.extend(['-movflags', '+faststart'])
prefix, sep, oldext = path.rpartition('.')
outpath = prefix + sep + targetext
self.to_screen('Remuxing video from %s to %s; Destination: %s' % (sourceext, targetext, outpath))
self.run_ffmpeg(path, outpath, options)
information['filepath'] = outpath
information['format'] = self._preferedformat
information['ext'] = self._preferedformat
information['format'] = targetext
information['ext'] = targetext
return [path], information
@@ -391,6 +442,10 @@ class FFmpegVideoConvertorPP(FFmpegPostProcessor):
class FFmpegEmbedSubtitlePP(FFmpegPostProcessor):
def __init__(self, downloader=None, already_have_subtitle=False):
super(FFmpegEmbedSubtitlePP, self).__init__(downloader)
self._already_have_subtitle = already_have_subtitle
def run(self, information):
if information['ext'] not in ('mp4', 'webm', 'mkv'):
self.to_screen('Subtitles can only be embedded in mp4, webm or mkv files')
@@ -406,18 +461,22 @@ class FFmpegEmbedSubtitlePP(FFmpegPostProcessor):
sub_langs = []
sub_filenames = []
webm_vtt_warn = False
mp4_ass_warn = False
for lang, sub_info in subtitles.items():
sub_ext = sub_info['ext']
if sub_ext == 'json':
self.to_screen('JSON subtitles cannot be embedded')
self.report_warning('JSON subtitles cannot be embedded')
elif ext != 'webm' or ext == 'webm' and sub_ext == 'vtt':
sub_langs.append(lang)
sub_filenames.append(subtitles_filename(filename, lang, sub_ext, ext))
else:
if not webm_vtt_warn and ext == 'webm' and sub_ext != 'vtt':
webm_vtt_warn = True
self.to_screen('Only WebVTT subtitles can be embedded in webm files')
self.report_warning('Only WebVTT subtitles can be embedded in webm files')
if not mp4_ass_warn and ext == 'mp4' and sub_ext == 'ass':
mp4_ass_warn = True
self.report_warning('ASS subtitles cannot be properly embedded in mp4 files; expect issues')
if not sub_langs:
return [], information
@@ -441,12 +500,13 @@ class FFmpegEmbedSubtitlePP(FFmpegPostProcessor):
opts.extend(['-metadata:s:s:%d' % i, 'language=%s' % lang_code])
temp_filename = prepend_extension(filename, 'temp')
self.to_screen('Embedding subtitles in \'%s\'' % filename)
self.to_screen('Embedding subtitles in "%s"' % filename)
self.run_ffmpeg_multiple_files(input_files, temp_filename, opts)
os.remove(encodeFilename(filename))
os.rename(encodeFilename(temp_filename), encodeFilename(filename))
return sub_filenames, information
files_to_delete = [] if self._already_have_subtitle else sub_filenames
return files_to_delete, information
class FFmpegMetadataPP(FFmpegPostProcessor):
@@ -471,7 +531,6 @@ class FFmpegMetadataPP(FFmpegPostProcessor):
# 1. https://kdenlive.org/en/project/adding-meta-data-to-mp4-video/
# 2. https://wiki.multimedia.cx/index.php/FFmpeg_Metadata
# 3. https://kodi.wiki/view/Video_file_tagging
# 4. http://atomicparsley.sourceforge.net/mpeg-4files.html
add('title', ('track', 'title'))
add('date', 'upload_date')
@@ -524,6 +583,18 @@ class FFmpegMetadataPP(FFmpegPostProcessor):
in_filenames.append(metadata_filename)
options.extend(['-map_metadata', '1'])
if '__infojson_filename' in info and info['ext'] in ('mkv', 'mka'):
old_stream, new_stream = self.get_stream_number(
filename, ('tags', 'mimetype'), 'application/json')
if old_stream is not None:
options.extend(['-map', '-0:%d' % old_stream])
new_stream -= 1
options.extend([
'-attach', info['__infojson_filename'],
'-metadata:s:%d' % new_stream, 'mimetype=application/json'
])
self.to_screen('Adding metadata to \'%s\'' % filename)
self.run_ffmpeg_multiple_files(in_filenames, temp_filename, options)
if chapters:

View File

@@ -0,0 +1,71 @@
from __future__ import unicode_literals
import re
from .common import PostProcessor
from ..compat import compat_str
from ..utils import str_or_none
class MetadataFromFieldPP(PostProcessor):
regex = r'(?P<field>\w+):(?P<format>.+)$'
def __init__(self, downloader, formats):
PostProcessor.__init__(self, downloader)
assert isinstance(formats, (list, tuple))
self._data = []
for f in formats:
assert isinstance(f, compat_str)
match = re.match(self.regex, f)
assert match is not None
self._data.append({
'field': match.group('field'),
'format': match.group('format'),
'regex': self.format_to_regex(match.group('format'))})
def format_to_regex(self, fmt):
r"""
Converts a string like
'%(title)s - %(artist)s'
to a regex like
'(?P<title>.+)\ \-\ (?P<artist>.+)'
"""
if not re.search(r'%\(\w+\)s', fmt):
return fmt
lastpos = 0
regex = ''
# replace %(..)s with regex group and escape other string parts
for match in re.finditer(r'%\((\w+)\)s', fmt):
regex += re.escape(fmt[lastpos:match.start()])
regex += r'(?P<' + match.group(1) + r'>[^\r\n]+)'
lastpos = match.end()
if lastpos < len(fmt):
regex += re.escape(fmt[lastpos:])
return regex
def run(self, info):
for dictn in self._data:
field, regex = dictn['field'], dictn['regex']
if field not in info:
self.report_warning('Video doesnot have a %s' % field)
continue
data_to_parse = str_or_none(info[field])
if data_to_parse is None:
self.report_warning('Field %s cannot be parsed' % field)
continue
self.write_debug('Searching for r"%s" in %s' % (regex, field))
match = re.search(regex, data_to_parse)
if match is None:
self.report_warning('Could not interpret video %s as "%s"' % (field, dictn['format']))
continue
for attribute, value in match.groupdict().items():
info[attribute] = value
self.to_screen('parsed %s from %s: %s' % (attribute, field, value if value is not None else 'NA'))
return [], info
class MetadataFromTitlePP(MetadataFromFieldPP): # for backward compatibility
def __init__(self, downloader, titleformat):
super(MetadataFromTitlePP, self).__init__(downloader, ['title:%s' % titleformat])
self._titleformat = titleformat
self._titleregex = self._data[0]['regex']

View File

@@ -1,44 +0,0 @@
from __future__ import unicode_literals
import re
from .common import PostProcessor
class MetadataFromTitlePP(PostProcessor):
def __init__(self, downloader, titleformat):
super(MetadataFromTitlePP, self).__init__(downloader)
self._titleformat = titleformat
self._titleregex = (self.format_to_regex(titleformat)
if re.search(r'%\(\w+\)s', titleformat)
else titleformat)
def format_to_regex(self, fmt):
r"""
Converts a string like
'%(title)s - %(artist)s'
to a regex like
'(?P<title>.+)\ \-\ (?P<artist>.+)'
"""
lastpos = 0
regex = ''
# replace %(..)s with regex group and escape other string parts
for match in re.finditer(r'%\((\w+)\)s', fmt):
regex += re.escape(fmt[lastpos:match.start()])
regex += r'(?P<' + match.group(1) + '>.+)'
lastpos = match.end()
if lastpos < len(fmt):
regex += re.escape(fmt[lastpos:])
return regex
def run(self, info):
title = info['title']
match = re.match(self._titleregex, title)
if match is None:
self.to_screen('Could not interpret title of video as "%s"' % self._titleformat)
return [], info
for attribute, value in match.groupdict().items():
info[attribute] = value
self.to_screen('parsed %s: %s' % (attribute, value if value is not None else 'NA'))
return [], info

View File

@@ -4,11 +4,11 @@ import shutil
from .common import PostProcessor
from ..utils import (
decodeFilename,
encodeFilename,
make_dir,
PostProcessingError,
)
from ..compat import compat_str
class MoveFilesAfterDownloadPP(PostProcessor):
@@ -25,21 +25,22 @@ class MoveFilesAfterDownloadPP(PostProcessor):
dl_path, dl_name = os.path.split(encodeFilename(info['filepath']))
finaldir = info.get('__finaldir', dl_path)
finalpath = os.path.join(finaldir, dl_name)
self.files_to_move[info['filepath']] = finalpath
self.files_to_move.update(info['__files_to_move'])
self.files_to_move[info['filepath']] = decodeFilename(finalpath)
make_newfilename = lambda old: decodeFilename(os.path.join(finaldir, os.path.basename(encodeFilename(old))))
for oldfile, newfile in self.files_to_move.items():
if not newfile:
newfile = make_newfilename(oldfile)
if os.path.abspath(encodeFilename(oldfile)) == os.path.abspath(encodeFilename(newfile)):
continue
if not os.path.exists(encodeFilename(oldfile)):
self.report_warning('File "%s" cannot be found' % oldfile)
continue
if not newfile:
newfile = os.path.join(finaldir, os.path.basename(encodeFilename(oldfile)))
oldfile, newfile = compat_str(oldfile), compat_str(newfile)
if os.path.abspath(encodeFilename(oldfile)) == os.path.abspath(encodeFilename(newfile)):
continue
if os.path.exists(encodeFilename(newfile)):
if self.get_param('overwrites', True):
self.report_warning('Replacing existing file "%s"' % newfile)
os.path.remove(encodeFilename(newfile))
os.remove(encodeFilename(newfile))
else:
self.report_warning(
'Cannot move file "%s" out of temporary directory since "%s" already exists. '
@@ -49,5 +50,5 @@ class MoveFilesAfterDownloadPP(PostProcessor):
self.to_screen('Moving file "%s" to "%s"' % (oldfile, newfile))
shutil.move(oldfile, newfile) # os.rename cannot move between volumes
info['filepath'] = compat_str(finalpath)
info['filepath'] = finalpath
return [], info

View File

@@ -43,6 +43,10 @@ class SponSkrubPP(PostProcessor):
if self.path is None:
return [], information
filename = information['filepath']
if not os.path.exists(encodeFilename(filename)): # no download
return [], information
if information['extractor_key'].lower() != 'youtube':
self.to_screen('Skipping sponskrub since it is not a YouTube video')
return [], information
@@ -58,7 +62,6 @@ class SponSkrubPP(PostProcessor):
if not information.get('__real_download', False):
self.report_warning('If sponskrub is run multiple times, unintended parts of the video could be cut out.')
filename = information['filepath']
temp_filename = prepend_extension(filename, self._temp_ext)
if os.path.exists(encodeFilename(temp_filename)):
os.remove(encodeFilename(temp_filename))

View File

@@ -15,6 +15,7 @@ from .utils import encode_compat_str
from .version import __version__
''' # Not signed
def rsa_verify(message, signature, key):
from hashlib import sha256
assert isinstance(message, bytes)
@@ -27,17 +28,13 @@ def rsa_verify(message, signature, key):
return False
expected = b'0001' + (byte_size - len(asn1) // 2 - 3) * b'ff' + b'00' + asn1
return expected == signature
'''
def update_self(to_screen, verbose, opener):
"""Update the program file with the latest version from the repository"""
return to_screen('Update is currently broken.\nVisit https://github.com/pukkandan/yt-dlp/releases/latest to get the latest version')
UPDATE_URL = 'https://blackjack4494.github.io//update/'
VERSION_URL = UPDATE_URL + 'LATEST_VERSION'
JSON_URL = UPDATE_URL + 'versions.json'
UPDATES_RSA_KEY = (0x9d60ee4d8f805312fdb15a62f87b95bd66177b91df176765d13514a0f1754bcd2057295c5b6f1d35daa6742c3ffc9a82d3e118861c207995a8031e151d863c9927e304576bc80692bc8e094896fcf11b66f3e29e04e3a71e9a11558558acea1840aec37fc396fb6b65dc81a1c4144e03bd1c011de62e3f1357b327d08426fe93, 65537)
JSON_URL = 'https://api.github.com/repos/pukkandan/yt-dlp/releases/latest'
def sha256sum():
h = hashlib.sha256()
@@ -54,55 +51,36 @@ def update_self(to_screen, verbose, opener):
to_screen('It looks like you installed youtube-dlc with a package manager, pip, setup.py or a tarball. Please use that to update.')
return
# compiled file.exe can find itself by
# to_screen(os.path.basename(sys.executable))
# and path to py or exe
# to_screen(os.path.realpath(sys.executable))
# Check if there is a new version
try:
newversion = opener.open(VERSION_URL).read().decode('utf-8').strip()
except Exception:
if verbose:
to_screen(encode_compat_str(traceback.format_exc()))
to_screen('ERROR: can\'t find the current version. Please try again later.')
to_screen('Visit https://github.com/blackjack4494/yt-dlc/releases/latest')
return
if newversion == __version__:
to_screen('youtube-dlc is up-to-date (' + __version__ + ')')
return
# Download and check versions info
try:
versions_info = opener.open(JSON_URL).read().decode('utf-8')
versions_info = json.loads(versions_info)
version_info = opener.open(JSON_URL).read().decode('utf-8')
version_info = json.loads(version_info)
except Exception:
if verbose:
to_screen(encode_compat_str(traceback.format_exc()))
to_screen('ERROR: can\'t obtain versions info. Please try again later.')
to_screen('Visit https://github.com/blackjack4494/yt-dlc/releases/latest')
return
if 'signature' not in versions_info:
to_screen('ERROR: the versions file is not signed or corrupted. Aborting.')
return
signature = versions_info['signature']
del versions_info['signature']
if not rsa_verify(json.dumps(versions_info, sort_keys=True).encode('utf-8'), signature, UPDATES_RSA_KEY):
to_screen('ERROR: the versions file signature is invalid. Aborting.')
to_screen('Visit https://github.com/pukkandan/yt-dlp/releases/lastest')
return
version_id = versions_info['latest']
version_id = version_info['tag_name']
if version_id == __version__:
to_screen('youtube-dlc is up-to-date (' + __version__ + ')')
return
def version_tuple(version_str):
return tuple(map(int, version_str.split('.')))
if version_tuple(__version__) >= version_tuple(version_id):
to_screen('youtube-dlc is up to date (%s)' % __version__)
return
to_screen('Updating to version ' + version_id + ' ...')
version = versions_info['versions'][version_id]
print_notes(to_screen, versions_info['versions'])
version = {
'bin': next(i for i in version_info['assets'] if i['name'] == 'youtube-dlc'),
'exe': next(i for i in version_info['assets'] if i['name'] == 'youtube-dlc.exe'),
'exe_x86': next(i for i in version_info['assets'] if i['name'] == 'youtube-dlc_x86.exe'),
}
# sys.executable is set to the full pathname of the exe-file for py2exe
# though symlinks are not followed so that we need to do this manually
@@ -113,7 +91,7 @@ def update_self(to_screen, verbose, opener):
to_screen('ERROR: no write permissions on %s' % filename)
return
# Py2EXE
# PyInstaller
if hasattr(sys, 'frozen'):
exe = filename
directory = os.path.dirname(exe)
@@ -122,19 +100,14 @@ def update_self(to_screen, verbose, opener):
return
try:
urlh = opener.open(version['exe'][0])
urlh = opener.open(version['exe']['browser_download_url'])
newcontent = urlh.read()
urlh.close()
except (IOError, OSError):
if verbose:
to_screen(encode_compat_str(traceback.format_exc()))
to_screen('ERROR: unable to download latest version')
to_screen('Visit https://github.com/blackjack4494/yt-dlc/releases/latest')
return
newcontent_hash = hashlib.sha256(newcontent).hexdigest()
if newcontent_hash != version['exe'][1]:
to_screen('ERROR: the downloaded file hash does not match. Aborting.')
to_screen('Visit https://github.com/pukkandan/yt-dlp/releases/lastest')
return
try:
@@ -147,16 +120,17 @@ def update_self(to_screen, verbose, opener):
return
try:
bat = os.path.join(directory, 'youtube-dlc-updater.bat')
bat = os.path.join(directory, 'yt-dlp-updater.cmd')
with io.open(bat, 'w') as batfile:
batfile.write('''
@echo off
echo Waiting for file handle to be closed ...
ping 127.0.0.1 -n 5 -w 1000 > NUL
move /Y "%s.new" "%s" > NUL
echo Updated youtube-dlc to version %s.
start /b "" cmd /c del "%%~f0"&exit /b"
\n''' % (exe, exe, version_id))
@(
echo.Waiting for file handle to be closed ...
ping 127.0.0.1 -n 5 -w 1000 > NUL
move /Y "%s.new" "%s" > NUL
echo.Updated youtube-dlc to version %s.
)
@start /b "" cmd /c del "%%~f0"&exit /b
''' % (exe, exe, version_id))
subprocess.Popen([bat]) # Continues to run in the background
return # Do not show premature success messages
@@ -169,19 +143,14 @@ start /b "" cmd /c del "%%~f0"&exit /b"
# Zip unix package
elif isinstance(globals().get('__loader__'), zipimporter):
try:
urlh = opener.open(version['bin'][0])
urlh = opener.open(version['bin']['browser_download_url'])
newcontent = urlh.read()
urlh.close()
except (IOError, OSError):
if verbose:
to_screen(encode_compat_str(traceback.format_exc()))
to_screen('ERROR: unable to download latest version')
to_screen('Visit https://github.com/blackjack4494/yt-dlc/releases/latest')
return
newcontent_hash = hashlib.sha256(newcontent).hexdigest()
if newcontent_hash != version['bin'][1]:
to_screen('ERROR: the downloaded file hash does not match. Aborting.')
to_screen('Visit https://github.com/pukkandan/yt-dlp/releases/lastest')
return
try:
@@ -196,6 +165,7 @@ start /b "" cmd /c del "%%~f0"&exit /b"
to_screen('Updated youtube-dlc. Restart youtube-dlc to use the new version.')
''' # UNUSED
def get_notes(versions, fromVersion):
notes = []
for v, vdata in sorted(versions.items()):
@@ -210,3 +180,4 @@ def print_notes(to_screen, versions, fromVersion=__version__):
to_screen('PLEASE NOTE:')
for note in notes:
to_screen(note)
'''

View File

@@ -50,6 +50,7 @@ from .compat import (
compat_html_entities_html5,
compat_http_client,
compat_integer_types,
compat_numeric_types,
compat_kwargs,
compat_os_name,
compat_parse_qs,
@@ -1714,6 +1715,8 @@ KNOWN_EXTENSIONS = (
'wav',
'f4f', 'f4m', 'm3u8', 'smil')
REMUX_EXTENSIONS = ('mp4', 'mkv', 'flv', 'webm', 'mov', 'avi', 'mp3', 'mka', 'm4a', 'ogg', 'opus')
# needed for sanitizing filenames in restricted mode
ACCENT_CHARS = dict(zip('ÂÃÄÀÁÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖŐØŒÙÚÛÜŰÝÞßàáâãäåæçèéêëìíîïðñòóôõöőøœùúûüűýþÿ',
itertools.chain('AAAAAA', ['AE'], 'CEEEEIIIIDNOOOOOOO', ['OE'], 'UUUUUY', ['TH', 'ss'],
@@ -3673,6 +3676,18 @@ def url_or_none(url):
return url if re.match(r'^(?:(?:https?|rt(?:m(?:pt?[es]?|fp)|sp[su]?)|mms|ftps?):)?//', url) else None
def strftime_or_none(timestamp, date_format, default=None):
datetime_object = None
try:
if isinstance(timestamp, compat_numeric_types): # unix timestamp
datetime_object = datetime.datetime.utcfromtimestamp(timestamp)
elif isinstance(timestamp, compat_str): # assume YYYYMMDD
datetime_object = datetime.datetime.strptime(timestamp, '%Y%m%d')
return datetime_object.strftime(date_format)
except (ValueError, TypeError, AttributeError):
return default
def parse_duration(s):
if not isinstance(s, compat_basestring):
return None
@@ -4156,7 +4171,18 @@ def qualities(quality_ids):
return q
DEFAULT_OUTTMPL = '%(title)s [%(id)s].%(ext)s'
DEFAULT_OUTTMPL = {
'default': '%(title)s [%(id)s].%(ext)s',
}
OUTTMPL_TYPES = {
'subtitle': None,
'thumbnail': None,
'description': 'description',
'annotation': 'annotations.xml',
'infojson': 'info.json',
'pl_description': 'description',
'pl_infojson': 'info.json',
}
def limit_length(s, length):
@@ -4666,9 +4692,7 @@ def cli_configuration_args(params, arg_name, key, default=[], exe=None): # retu
return default, False
assert isinstance(argdict, dict)
assert isinstance(key, compat_str)
key = key.lower()
args = exe_args = None
if exe is not None:
assert isinstance(exe, compat_str)
@@ -5934,3 +5958,14 @@ def load_plugins(name, type, namespace):
if plugin_info[0] is not None:
plugin_info[0].close()
return classes
def traverse_dict(dictn, keys, casesense=True):
if not isinstance(dictn, dict):
return None
first_key = keys[0]
if not casesense:
dictn = {key.lower(): val for key, val in dictn.items()}
first_key = first_key.lower()
value = dictn.get(first_key, None)
return value if len(keys) < 2 else traverse_dict(value, keys[1:], casesense)

View File

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2021.01.20'
__version__ = '2021.02.04'