Commit Graph

456 Commits

Author SHA1 Message Date
rlaphoenix 97efb59e5f Only decode text direction entities in Sub files (cont.)
Already did this for HLS, but somehow forgot to for DASH and direct URLs.
2024-02-29 22:06:57 +00:00
rlaphoenix 4073cefc74 Remove Subtitle.remove_multi_lang_srt_header()
The root cause of the error which required calling this function was identified and fixed in this release.
2024-02-29 22:06:02 +00:00
Arias800 75641bc8ee
Add default shaka-packager build name (#74)
If the user build Shaka-packager manually, the default name will be “packager”.
Adding it to the list will ensure that Devine detects the app in this situation.
2024-02-27 22:48:54 +00:00
rlaphoenix 0c20160ddc Implement `__add__` to Tracks class 2024-02-20 22:06:39 +00:00
rlaphoenix eef397f2e8 HLS: Don't include map data if discontinuity/end of playlist was decrypted
The decrypt() call just before it would have included the map data for us, as it was needed to decrypt. Therefore, it would not need to be added again when merge_discontinuity() is called. In some cases re-adding the map data can cause playback or final merge failure.
2024-02-20 20:12:09 +00:00
rlaphoenix b829ea5c5e DASH: Detect SDH subtitles via AudioPurposeCS:2007=2 2024-02-20 19:29:21 +00:00
rlaphoenix 7f898cf2df HLS: Fix map data exists check when merging segments
`map_data` may resolve Truthy, while `map_data[1]` itself could be None, resulting in `None` being written to the stream.
2024-02-20 02:14:58 +00:00
rlaphoenix 2635d06d58 Set stop event & mark track failed if new HLS DRM fails to license 2024-02-20 01:46:47 +00:00
rlaphoenix 8de3a95c6b Flush file buffers when merging DASH or HLS segments 2024-02-20 01:35:58 +00:00
rlaphoenix 1259a26b14 Create and use new utility to get file extension from URLs/Paths
Fixes #73
2024-02-19 18:14:50 +00:00
rlaphoenix c826a702ab DASH: Fix URL concatenation in some edge cases
In some of the urljoin()'s it would end with `/None`, e.g., `http://.../some_base_value/None`, when it should just join with the base value only.
2024-02-19 17:45:40 +00:00
rlaphoenix 1b76e8ee28 Aria2c: Fix shutdown condition edge condition when URLs > 1000
`stopped_downloads` is capped to just 1000 objects even though I asked for 999999 downloads, so if aria2c is downloading more than 1000 URLs the count of stopped downloads will never match the count of download URLs and never stop.
2024-02-17 23:33:52 +00:00
rlaphoenix d65d29efa3 Remove unnecessary LANGUAGE_MUX_MAP
This language tag/code mapping table is no longer needed as of MKVToolNix v67, which has been the minimum supported version for some time now already.
2024-02-17 23:19:07 +00:00
rlaphoenix 81dca063fa Consolidate typing of Requests/MozillaCookieJar typing to CookieJar 2024-02-16 21:02:06 +00:00
rlaphoenix 9e0515609f HLS: Ignore possible folders when doing naive final merge 2024-02-16 18:41:05 +00:00
rlaphoenix 323577a5fd HLS: Update first segment of EXT-X-KEY state data on discontinuity 2024-02-16 18:21:21 +00:00
rlaphoenix e26e55caf3 HLS: Don't reset EXT-X-KEY state data on discontinuity 2024-02-16 16:50:12 +00:00
rlaphoenix 506ba0f615 HLS: Only merge relevant segments on discontinuity 2024-02-16 16:49:42 +00:00
rlaphoenix 2388c85894 HLS: Ensure all segments to decrypt in range exist 2024-02-16 16:49:13 +00:00
rlaphoenix 7587243aa2 HLS: Don't decrypt on key change if there were no prior segments 2024-02-16 16:48:38 +00:00
rlaphoenix 6a37fe9d1b HLS: Don't merge on discontinuity, if it's the first segment
How the m3u8 parser handles/groups #EXT-X to segment objects means the #EXT-X-DISCONTINUITY (`discontinuity` property) is tied to whatever segment is below it's line. Therefore, there's never a scenario where we need to merge+decrypt and the first every segment of the for loop, as there's no segments before it.

This can happen from just slightly off-spec playlists (can't blame it) but also from the OnSegmentFilter filtering out all segments before the first EXT-X-DISCONTINUITY. Common to happen when filtering out bumpers/intros.
2024-02-16 00:15:36 +00:00
rlaphoenix eac5ed5b61 Aria2c: Fix completed progress information
For some reason aria2c has like 700 internal "download" structs per actual URL it was downloading, probably something to do with multiple connections/split, don't know don't care, as this way works just fine.
2024-02-15 23:54:10 +00:00
rlaphoenix a8a89aab9c Aria2c: Fallback to an empty list if stopped_downloads is None
This was fine during most testing in the `for` loop below it, but there's also a len() a bit below that.
2024-02-15 23:45:44 +00:00
rlaphoenix 837015b4ea HLS: Fix incorrect last segment i when decrypting first segment 2024-02-15 23:44:00 +00:00
rlaphoenix 1f11ed258b DASH: Update progress bar when merging segments 2024-02-15 20:06:42 +00:00
rlaphoenix 4e12b867f1 Aria2c: Improve download progress and error handling 2024-02-15 19:19:37 +00:00
rlaphoenix e8b07bf03a DASH: Don't set Range Header if no bytes range value
This caused a HTTP 501 Not Implemented on some CDNs.
2024-02-15 19:10:52 +00:00
rlaphoenix 630a9906ce Rework the Aria2c Downloader
- Downloads are now multithreaded directly in the downloader.
- Now reuses connections instead of having to close and reopen connections for every single download.
- Progress updates are now yielded back to the caller instead of drilling down a progress callable.
- Instead of parsing download progress information in a very hacky way from the stdout stream, use aria2's RPC interface.
- Added a new utility get_free_port which is needed to choose aria2's RPC port as I do not want to use the default port in case the user is already using this port for another tool or reason. Also, to try mitigate port scanning attacks that target aria2 RPC ports.
- The config entry `aria2c.max_concurrent_downloads` is now actually used by aria2c when downloading.
- The `--max-concurrent-downloads` option and config value now defaults to `min(32,(cpu_count+4))` (usually around 16 for above average systems) instead of 5.
- Automated pproxy proxy rerouter is made via subprocess instead of trying to re-do what the pproxy entry point does for us, less code, less trouble, and was ultimately easier to implement.
2024-02-15 17:26:39 +00:00
rlaphoenix 2b7fc929f6 Rework the HLS downloader, add support for new downloaders
- It now downloads all segment files multi-threaded first before any decryption or merging operations (excluding init data, which will be downloaded in sequence/order after all the segments are downloaded)
- Once all segments are downloaded it then starts to go through and do any merging/decryption/init data stuff/e.t.c afterwards.
- Segments are no longer decrypted one by one. If segments use the same EXT-X-KEY data, then they will be merged together and then decrypted. This should see a noticeable speed increase for Widevine DRM.
2024-02-15 17:26:39 +00:00
rlaphoenix e5a330df7e Add support for the new Downloaders to direct URLs 2024-02-15 17:26:39 +00:00
rlaphoenix a1ed083b74 Add support for the new Downloaders to DASH 2024-02-15 17:26:39 +00:00
rlaphoenix 0e96d18af6 Rework the Requests and Curl-Impersonate Downloaders
- Downloads are now multithreaded directly in the downloader.
- Requests and Curl-Impersonate use one singular Session for all downloads, keeping connections alive and cached so it doesn't have to close and reopen connections for every single download.
- Progress updates are now yielded back to the caller instead of drilling down a progress callable.
2024-02-15 17:26:39 +00:00
rlaphoenix 709901176e Use CRC32 instead of MD5 for Track IDs in DASH/HLS 2024-02-15 10:56:51 +00:00
rlaphoenix bd185126b6 HLS: Skip merging continuity if all segments were skipped
If all segments of a continuity is skipped, i.e. by OnSegmentFilter, then this code fails as the folder wouldn't exist.
2024-02-13 17:03:42 +00:00
rlaphoenix cd194e3192 Add new Track Event, OnSegmentDownloaded
Like OnDownloaded but called every time a DASH or HLS segment is downloaded. The path to the downloaded segment file is passed to the callable.
2024-02-10 18:10:09 +00:00
rlaphoenix 87779f4e7d Move Track OnDownloaded event before decryption 2024-02-10 18:05:35 +00:00
rlaphoenix a98d1d98ac Add a new Subtitle Track Event, OnConverted
This runs after a Subtitle has been converted to another format, and only if it was converted. It is passed the new subtitle format codec value.
2024-02-10 18:05:35 +00:00
rlaphoenix c18fe5706b Pass DRM and Segment objects to Track OnDecrypted event 2024-02-10 17:48:26 +00:00
rlaphoenix 439e376b38 No longer pass the track through track events
If you are setting a callable onto a track event, then you have access to the track variable, so just include/use that in your lambda/callable.
2024-02-10 17:47:12 +00:00
rlaphoenix 7be24a130d Give some documentation on Track events 2024-02-10 17:19:48 +00:00
rlaphoenix 8bf6e4d87e Improve typing of Chapters constructor 2024-02-10 12:47:14 +00:00
rlaphoenix 92e00ed667 Fix OGM Chapter Regex patterns in Chapters class 2024-02-10 12:42:17 +00:00
rlaphoenix 66edf577f9 Allow Chapter Timestamp to be float, fix typing 2024-02-10 12:35:02 +00:00
rlaphoenix a544b1e867 Merge HLS segments first by discontinuity then via FFmpeg
HLS playlists where each segment is in an mp4 container seems to corrupt when the EXT-X-MAP is changed out, unless you first merge segments by discontinuity and then merge the merges via FFmpeg (which demuxes all the merged segment continuities and then concatanates them together, probably giving it new init data too).
2024-02-09 08:33:17 +00:00
rlaphoenix 167b45475e Only decode text direction entities in Sub files
Previously, all entities were decoded in Subtitle files because of a problem with SubtitleEdit and it's /ReverseRtlStartEnd option not being entity-aware.

It actually ends up reversing the `;` of `&rlm;`, instead of the actual value of `&rlm;`. Therefore, I decoded all entities before SubtitleEdit could have processed the Subtitle, but this has caused problems with more advanced formats like TTML and WebVTT as `&lt;` would decode to `<` causing syntax errors, among other problematic characters.

According to the TTML and WebVTT spec, html entity encoding is allowed, and that makes sense or you wouldn't be able to use `<` etc. Any failure for players to show the decoded character would be a player problem and be out of scope with Devine.
2024-02-05 12:37:21 +00:00
rlaphoenix 568cb616df Use /ConvertColorsToDialog when converting subs to SRT format
This is because SubtitleEdit keeps color-related information when converting to SRT from WebVTT, TTML, and such formats. Why? Not 100% sure. Maybe some players support colors, but generally if you are using SubRip, it's because you either only want basic text subs, or your player doesn't support these "fancy" ooh-la-la colors.

This is a better solution to just stripped out the information. As the option name suggests, it isn't just removing the color information but rather using it to detect different speakers, then appropriately "dialogify" the captions when needed. I.e., start each speaker's sentence with `- `, and separate them with a new line.

The dash-style dialog formatting is quite vital to know if a caption is all spoken by one speaker versus multiple. Not particularly necessary for non-SDH captioning, but would be wanted for SDH subtitles.
2024-02-05 12:10:33 +00:00
rlaphoenix 3b62b50e25 Add support for SegmentBase and BaseURL-only DASH Manifests 2024-02-05 10:22:40 +00:00
rlaphoenix c06ea4cea8 Rework Chapter System, add `Chapters` class
Overall this commit is to just make working with Chapters a lot less manual and convoluted. The current system has you specify information that can easily be automated, like Chapter order and numbers, which is one of the main changes in this commit.

Note: This is a Breaking change and requires updates to your Service code. The `get_chapters()` method must be updated. For more information see the updated doc-string for `Service.get_chapters()`.

- Added new Chapters class which automatically sorts Chapters by timestamp.
- Chapter class has been significantly reworked to be much more generic. Most operations have been mvoed to the new Chapters class.
- Chapter objects can no longer specify a Chapter number. The number is now automatically set based on it's sorted order in the Chapters object, which is all done automatically.
- Chapter objects can now provide a timestamp in more formats. Timestamp's are now verified more efficiently.
- Chapter objects ID is now a crc32 hash of the timestamp and name instead of just basically their number.
- The Chapters object now also has an ID which is also a crc32 hash of all of the Chapter IDs it holds. This ID can be used for stuff like temp paths.
- `Service.get_chapters()` must now return a Chapters object. The Chapters object may be empty. The Chapters object must hold Chapter objects.
- Using `Chapter {N}` or `Act {N}` Chapters and so on is no longer permitted. You should instead leave the name blank if there's no descriptive name to use for it.
- If you or a user wants `Chapter {N}` names, then they can use the config option `chapter_fallback_name` set to `"Chapter {i:02}"`. See the config documentation for more info.
- Do not add a `00:00:00.000` Chapter, at all. This is automatically added for you if there's at least 1 Chapter with a timestamp after `00:00:00.000`.
2024-02-05 01:42:43 +00:00
rlaphoenix 2affb62ad0 Fix SegmentList source/media join with Base URL in DASH download_track() 2024-02-03 05:26:52 +00:00
rlaphoenix 30abe26321 Improve caching of keys to vaults log 2024-01-29 17:02:30 +00:00