555 Commits

Author SHA1 Message Date
rlaphoenix 7cc7227f8c Specify utf8 with SubtitleEdit when stripping hearing impaired 2023-12-29 16:02:10 +00:00
rlaphoenix d94d6042b7 Fix Chapter Encoding on Windows when muxing with mkvmerge
On Windows, the default seems to be some encoding other than UTF-8 (possibly UTF-16 or CP-1252). Since the chapter file is saved as UTF-8, this breaks characters outside the typical range, such as ø and æ.
2023-12-03 15:04:58 +00:00
rlaphoenix 308ddbd394 Improve private forking instructions in README 2023-12-03 00:17:04 +00:00
rlaphoenix 7cec16d8ab Validate track languages in HLS.to_tracks 2023-12-02 22:40:41 +00:00
rlaphoenix 86635f9b7f Add Support for Python 3.12, update dependencies 2023-12-02 21:17:41 +00:00
rlaphoenix 8cd6dfb65a Implement `--sub-format` in dl to set output subtitle format
The default is still SubRip SRT, but you can now change the output format to almost any of the available Codec options. There is no option yet to leave the subtitle format as-is, i.e., if there is an SRT and a WebVTT subtitle, to leave them both as they are.

Like always, you can configure a default in your config file, e.g.,

```yaml
dl:
  sub_format: vtt
```

Note though that SSA, SSAv4, fTTML, and fVTT are not yet supported. There are no plans to support fTTML or fVTT.
2023-12-02 17:56:40 +00:00
rlaphoenix e87de50940 Exclude fragmented Sub Codecs from DASH UTF-8 checks
Chardet was detecting a mixture of mostly cp1252 and MacRoman encoding, where the data should just be left as-is when parsing. The actual text within it may want to go through `try_ensure_utf8` when parsed, but not the entire box.
2023-12-02 17:44:47 +00:00
rlaphoenix 0be62541ba Handle chardet returning `None` as encoding 2023-12-02 15:10:00 +00:00
Shivelight c31ee338dc Add option for automatic subtitle character encoding normalization (#68)
* Add option for automatic subtitle character encoding normalization

The rationale behind this function is that some services use ISO-8859-1
(latin1) or Windows-1252 (CP-1252) instead of UTF-8 encoding, whether
intentionally or accidentally. Some services even stream subtitles with
malformed/mixed encoding (each segment has a different encoding).

* Remove Subtitle parameter `auto_fix_encoding`

Just always attempt to fix encoding. If the subtitle is neither UTF-8 nor CP-1252, then it should realistically error out instead of producing garbage Subtitle data anyway.

* Move Subtitle encoding fixing code out of if drm tree

* Use chardet as a last ditch effort fixing Subs, or return original data

* Move Subtitle.fix_encoding method to utilities as try_ensure_utf8

* Add Shivelight as a contributor

---------

Co-authored-by: rlaphoenix <rlaphoenix@pm.me>
2023-12-02 11:00:55 +00:00
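
The fallback order described in this commit (UTF-8 first, then CP-1252, otherwise return the original data) can be sketched roughly as follows. This is not the project's actual `try_ensure_utf8`; the body is an assumption that uses only the standard library, and the chardet last-resort step is left out.

```python
def try_ensure_utf8(data: bytes) -> bytes:
    """Return `data` as UTF-8 if it can be converted, else return it unchanged.

    Hypothetical sketch: the real helper also consults chardet as a last resort.
    """
    try:
        # Already valid UTF-8: nothing to do.
        data.decode("utf-8")
        return data
    except UnicodeDecodeError:
        pass
    try:
        # Assume CP-1252 and re-encode the text as UTF-8.
        return data.decode("cp1252").encode("utf-8")
    except UnicodeDecodeError:
        # Neither encoding fits, so hand back the original bytes.
        return data
```

Bytes that are valid in neither encoding (for example a lone `0x81`) are returned untouched rather than being turned into garbage.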
rlaphoenix 4b8cfabaac Fix all Ruff and isort linter errors 2023-12-02 09:57:13 +00:00
rlaphoenix 959590a6bb Overhaul tooling, linting, editor configs, and README 2023-12-02 09:57:13 +00:00
rlaphoenix c159672181 Update Video.Range.from_cicp with changes in H.Sup19 (04/21)
Note: There are some breaking changes here. If you manually worked with the Enum names here, some of them have changed to better reflect the code points' usage.

Generally speaking it should not affect service code.
2023-09-04 00:48:50 +01:00
rlaphoenix aff40df7d1 Raise CalledProcessError if Shaka logs an error
This is necessary because Shaka-packager seems to always return exit code 0, even on errors.
2023-07-15 18:13:24 +01:00
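
A minimal sketch of the idea, assuming the tool's log output has already been captured as text. The function name and the `marker` string are hypothetical; the actual check in the codebase and the exact format of Shaka's log lines are not shown in this commit.

```python
import subprocess
from typing import Sequence


def raise_if_log_has_error(
    cmd: Sequence[str], returncode: int, log_text: str, marker: str = "error"
) -> None:
    """Raise CalledProcessError if any log line contains `marker`.

    `marker` is an illustrative assumption, not Shaka's real log format.
    """
    for line in log_text.splitlines():
        if marker in line.lower():
            # The exit code cannot be trusted, so the log line decides.
            raise subprocess.CalledProcessError(returncode, cmd, stderr=log_text)
```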
rlaphoenix f3cfaa3ab3 Fix DASH FPS error when SegmentBase is not found 2023-07-15 18:08:01 +01:00
rlaphoenix 883c9ae063 docs: Add Discord badge to README 2023-07-09 14:41:11 +01:00
rlaphoenix a31cb6aa2f deps: Update all dependencies 2023-07-07 18:20:49 +01:00
rlaphoenix bfceb15f14 docs: Remove portable installation steps and info
I'm not happy with the approach used here to make portable installations of Devine, therefore for now I will remove the information relating to portable installations.
2023-07-04 03:03:07 +01:00
rlaphoenix 9aafa3d8df Add missing cookies param on aria2c function recursion 2023-06-01 00:40:13 +01:00
rlaphoenix a01766c60b Remove the saldl downloader 2023-05-31 23:04:48 +01:00
rlaphoenix d369e6134c Add function to fix Start/End Chars on Subtitles 2023-05-30 20:22:40 +01:00
rlaphoenix 6cfbaa7db1 Pass cookies to the aria2c and requests downloaders
For aria2c I've simplified the operation by offloading most of the work of creating a cookie header to the same logic Python-requests uses. This results in the exact same cookies Python-requests would have used in a requests.get() call or similar. It supports multiple same-name cookies under different domains/paths, selected based on the URI of the mock request.
2023-05-29 22:23:39 +01:00
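
Python-requests builds its cookie handling on the standard library's `http.cookiejar`, so the URI-based selection can be sketched with the standard library alone. The helper names `make_cookie` and `cookie_header` are hypothetical and are not the functions used in the codebase.

```python
from http.cookiejar import Cookie, CookieJar
from typing import Optional
from urllib.request import Request


def make_cookie(name: str, value: str, domain: str, path: str = "/") -> Cookie:
    """Build a simple session cookie (illustrative helper)."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def cookie_header(jar: CookieJar, url: str) -> Optional[str]:
    """Return the Cookie header value the jar would send for `url`, if any."""
    request = Request(url)          # mock request, never actually sent
    jar.add_cookie_header(request)  # jar picks cookies matching domain and path
    return request.get_header("Cookie")
```

The resulting string could then be passed to an external downloader as a plain `Cookie` header.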
rlaphoenix 1ff4858ca7 Fix mistake in Web Address for FFmpeg in README 2023-05-28 19:46:55 +01:00
rlaphoenix fd52073605 Skip merging of HLS segments if `--skip-dl` is used
Partially fixes #61
2023-05-27 20:20:07 +01:00
rlaphoenix 89f5e04348 Bump requests from 2.28.2 to 2.31.0 2023-05-27 20:15:51 +01:00
rlaphoenix 57af8d98c9 Add --video-only flag to dl command 2023-05-26 11:16:12 +01:00
rlaphoenix 215730663b Allow --audio/subs/chapters-only to be used simultaneously
E.g., if you only wanted the subs and chapters, this would now be possible with `--subs-only --chapters-only`.
2023-05-26 11:15:38 +01:00
rlaphoenix 6a9598021d Re-raise errors when loading WVD files so it's more understandable
It also looks for the "expected 2 but parsed 1" error which is likely an error while parsing the WVD version field. If this happens, it will inform the user to use `pywidevine migrate`.
2023-05-25 04:45:49 +01:00
rlaphoenix a24633fe61 List available Services on error
This is mainly to lessen confusion caused by service name typos or by new users getting used to the CLI.

It also changed the Exceptions on the methods of Service from ClickException to a KeyError since they are intended to be used on the core codebase outside of the context of Click.
2023-05-25 04:37:17 +01:00
rlaphoenix df2f9b85ae Use urljoin instead of an if check and + op in HLS
urljoin() was used even before devine was public, but the code was constantly changed back and forth between a urljoin(), another form of urljoin (something custom or something I can't remember), and an if check + addition.

However, I can confirm that a simple if check will not work as the Base URI might not even be in the same relative root. The if checks have also been inconsistent with some checking if it starts with http(s)://, and some checking if it does not have the base URI at the start of the string.

The if check method does not work as well as a urljoin() can. Using urljoin() also fixes some services, as some HLS playlists would have the m3u8 URL on a completely different root, subdomain, or even domain, causing downloads to break completely when trying to fetch segments.
2023-05-21 00:06:30 +01:00
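
A short illustration of why `urljoin()` covers the cases an if check plus string addition misses. The URLs are made up for the example.

```python
from urllib.parse import urljoin

# Hypothetical playlist location used as the base URI.
base = "https://cdn.example.com/hls/master.m3u8"

# Relative to the playlist's directory.
relative = urljoin(base, "video/720p.m3u8")
# Relative to the host root, not the playlist's directory.
root_relative = urljoin(base, "/other/seg1.ts")
# Climbs out of the playlist's directory.
parent_relative = urljoin(base, "../alt/seg1.ts")
# Already absolute and on a different domain: returned unchanged.
absolute = urljoin(base, "https://media.example.net/seg1.ts")
```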
rlaphoenix 301c026ca9 Remove Smart/Fancy Left/Right Quotation Marks from Filename Sanitizer 2023-05-20 22:10:55 +01:00
rlaphoenix 8df04de1ea Remove file size check from Requests downloader
We cannot actually do this check. The Content-Length value will be the size after being further encoded or compressed. While we can find out what it was compressed with via the Content-Encoding header, we cannot match the downloaded length with the Content-Length header as requests will automatically decompress/decode according to the Content-Encoding header.
2023-05-19 22:11:05 +01:00
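
The mismatch can be reproduced without any network access. In this sketch, `gzip` stands in for whatever Content-Encoding a server might apply, and the payload is invented for the example.

```python
import gzip

# Hypothetical response body as the client finally sees it.
body = b"WEBVTT\n\n" + b"00:00:01.000 --> 00:00:02.000\nHello\n\n" * 50

# What travels over the wire when the server applies gzip encoding.
wire = gzip.compress(body)
content_length = len(wire)  # the value Content-Length would carry

# What the client holds after transparent decompression.
received = gzip.decompress(wire)
sizes_match = len(received) == content_length
```

Here `sizes_match` is false even though the download is complete and correct, which is why comparing the two values is not a valid integrity check.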
rlaphoenix 8ada6165e3 Set stop event & mark track failed if DASH DRM fails to license 2023-05-19 19:07:35 +01:00
rlaphoenix 6e844409ae Set stop event & mark track failed if HLS Session DRM fails to license 2023-05-19 19:07:06 +01:00
rlaphoenix c9ecab444f Use range offset when calculating HLS init map byte ranges 2023-05-19 18:38:33 +01:00
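
An HLS `BYTERANGE` value has the form `<length>[@<offset>]`, while an HTTP `Range` header takes inclusive start and end positions, so the offset has to be added to both ends. A minimal sketch, with a hypothetical function name and a `default_offset` parameter assumed for the case where the offset is omitted:

```python
def byterange_to_range_header(byterange: str, default_offset: int = 0) -> str:
    """Convert an HLS '<length>[@<offset>]' value into an HTTP Range value."""
    length, _, offset = byterange.partition("@")
    start = int(offset) if offset else default_offset
    # HTTP ranges are inclusive, hence the -1 on the end position.
    end = start + int(length) - 1
    return f"bytes={start}-{end}"
```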
rlaphoenix 3e0b7ef200 Fix regression where Range header is accidentally kept and re-used 2023-05-19 00:35:46 +01:00
rlaphoenix 8e7a63f0b9 Fix the file move in `wvd add` when the WVDs folder does not exist
On new installs, or where the `WVDs` folder has not been created yet, shutil.move() treats the destination as a file path and moves the `.wvd` file to the WVDs folder path, as a file. If the folder existed but was empty, this error would not have occurred.
2023-05-19 00:35:46 +01:00
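
One way to avoid this behaviour is to create the destination folder before moving and to spell out the full destination file path. This is a sketch with a hypothetical helper name, not necessarily how the fix was implemented.

```python
import shutil
from pathlib import Path


def move_into_dir(src: Path, dest_dir: Path) -> Path:
    """Move `src` into `dest_dir`, creating the directory first if needed."""
    # Without this, a missing dest_dir would be treated as the new file name.
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(src), str(dest_dir / src.name)))
```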
rlaphoenix 55a86ac6c9 Fix filesize.decimal call in requests downloader size exception 2023-05-17 03:32:08 +01:00
rlaphoenix dd64212ad2 Move download_segment() from DASH/HLS download_track() to Class
Various overall small readability improvements have also been made.
2023-05-17 03:20:01 +01:00
rlaphoenix 03c012f88e Move the Downloaded msg after Decrypt msg in DASH/URL downloads 2023-05-17 02:09:16 +01:00
rlaphoenix 6cdde3efb0 Override the downloader more efficiently in DASH/HLS when Range is used 2023-05-17 01:33:06 +01:00
rlaphoenix 6d4be8620c Only write segment data if the tfhd fix was necessary in DASH 2023-05-17 01:22:59 +01:00
rlaphoenix 681d69d5e5 Mark DASH and URL tracks as Decrypting when using shaka
DASH and normal URL downloads now both decrypt one large single or merged file after all downloads are finished. This leaves a bit of a "pause" between progress bar movement which looks a bit odd. So mark the track as being in a Decrypting state.
2023-05-16 22:01:07 +01:00
rlaphoenix a45c784569 Replace download speeds with "Downloaded" text when finished 2023-05-16 21:59:03 +01:00
rlaphoenix 2a8307b98d Decrypt DASH downloads after merging all segments
Since DASH doesn't have the ability to change keys dynamically per-track (Representation), there's no need for the DASH downloader to decrypt segments as they are downloaded (like HLS).

This halves the number of processes that need to be opened as well as the I/O usage. It may result in noticeably lower CPU usage. Since the IOPS is lowered, you may even see an increase in download speed if downloading to something like a slow HDD.

This also fixes decryption in some weird edge-cases where decrypting each segment individually resulted in timestamp anomalies causing shaka to fail.
2023-05-16 21:55:53 +01:00
rlaphoenix bdc1203514 Only verify download size in requests downloader if possible
Some servers may not respond with the Content-Length header, even for segmented media, e.g., if it's a subtitle URL. The requests downloader required the header to be present for each URL it downloads, which cannot be guaranteed.

Now it tries to get it if possible, and verifies the download size with the Content-Length value if it could be obtained.
2023-05-16 20:49:43 +01:00
rlaphoenix 2a4e9505f1 Remove unnecessary HEAD calls in requests downloader
HEAD requests were made to sum the total file size of the download operation. However, the downloader may be used on URLs where the content is not segmented media. Therefore, the server may not support or respond with the Content-Length header, which causes the requests downloader to crash before it even gets a chance to begin downloading.

Even still, this total size value isn't really necessary, and would cause possibly 100s of HEAD requests (in quick succession of each other) on segmented sources. It would also add up-front delay before it actually starts to download.
2023-05-16 20:47:26 +01:00
rlaphoenix e7dc138c0f Improve readability and documentation of DASH's to_tracks function 2023-05-15 16:19:53 +01:00
rlaphoenix e079febe79 Ensure output directory exists in requests downloader 2023-05-15 13:33:59 +01:00
rlaphoenix 95802d1e64 Fix regression with downloader mapper on aria2c and saldl
The setup I had for using asyncio.run with functools.partial didn't actually pan out. A full pass-through lambda is required.

I've also moved the mapped downloader variable to the root of the downloaders package.
2023-05-12 12:19:34 +01:00
rlaphoenix be403bbff4 Implement a Python-requests-based downloader 2023-05-12 07:02:39 +01:00