Support signaling of Dolby Vision profiles 8.1, 8.2, 8.4, 10.1, and 10.4 in HLS
and DASH.
Adds a new option `--use_dovi_supplemental_codecs` (off by default) to use
SUPPLEMENTAL-CODECS in HLS and `scte214:supplementalCodecs` and
`scte214:supplementalProfiles` in DASH.
To maintain compatibility with existing players, the current behavior of
using two entries in the manifest remains the default. This will change
in a future version, where `use_dovi_supplemental_codecs` will be
enabled by default.
Adds the Dolby Vision compatible brands 'db1p', 'db2g', 'db4g', 'db4h',
and 'dby1', based on https://mp4ra.org/#/brands
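For reference, a minimal sketch of how such a supplemental codec value can be assembled. The helper itself is hypothetical (not the packager's implementation); the profile-to-fourcc mapping and the two-digit profile/level layout follow Dolby's published codec-string conventions.

```cpp
#include <cstdio>
#include <string>

// Illustrative only: build the value that would go into HLS
// SUPPLEMENTAL-CODECS or DASH scte214:supplementalCodecs.
std::string DolbyVisionSupplementalCodec(int profile, int level,
                                         const std::string& compatible_brand) {
  // Profile 8 is HEVC-based, profile 9 AVC-based, profile 10 AV1-based.
  const char* fourcc = profile == 10 ? "dav1" : profile == 9 ? "dvav" : "dvh1";
  char codec[32];
  // Dolby Vision codec strings use two-digit, zero-padded profile and level,
  // e.g. "dvh1.08.06".
  std::snprintf(codec, sizeof(codec), "%s.%02d.%02d", fourcc, profile, level);
  // HLS may pair the codec with a compatible brand, e.g. "dvh1.08.06/db4h";
  // DASH carries just the codec string in scte214:supplementalCodecs.
  return compatible_brand.empty() ? codec
                                  : std::string(codec) + "/" + compatible_brand;
}
```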
---------
Co-authored-by: Xingzhao Yun <xyun@dolby.com>
Set the start number in the representation to the segment index sent by the muxer.
With this enhancement, you can now specify the initial sequence number
to be used for the generated segments when calling the packager.
With the old implementation, numbering always started at "1".
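As a rough illustration of the effect (not the muxer's actual code), the generated segment names simply shift by the configured start number, matching the usual DASH `SegmentTemplate@startNumber` convention:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper for illustration only: names segments starting from a
// caller-provided start number instead of the previously hard-coded 1.
std::vector<std::string> SegmentNames(const std::string& prefix,
                                      uint32_t start_number,
                                      uint32_t segment_count) {
  std::vector<std::string> names;
  for (uint32_t i = 0; i < segment_count; ++i) {
    // With start_number == 1 this matches the old behavior; any other value
    // shifts the whole sequence, e.g. prefix + "100.m4s", "101.m4s", ...
    names.push_back(prefix + std::to_string(start_number + i) + ".m4s");
  }
  return names;
}
```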
---------
Co-authored-by: Cosmin Stejerean <cstejerean@meta.com>
This PR adds parsing of teletext styling, and rendering of the styling
in output TTML and WebVTT subtitle tracks.
Beyond unit tests, I've used the sample
https://drive.google.com/file/d/19ZYsoeUfH85gEilQkaAdLbPhC4CxhDEh/view?usp=sharing
which has rather advanced subtitling with two separate rows at the same
time, where one is left-aligned and the other is right-aligned. This
necessitates rendering two parallel cues. It also has some colored
text.
Solves #1335.
## parse teletext styling and formatting
Extend the teletext parser to parse the teletext styling and formatting.
This includes translating rows into regions, calculating alignment
from the start and stop positions of the text, and extracting text and
background colors.
The colors are limited to full lines.
Both lines and regions are propagated in the TextSample structures,
because the number of lines may differ between sources.
For teletext, there are 24 rows, but they are essentially always
used with double height, so the number of output lines is 12,
numbered 0 to 11.
There are also corresponding regions, denoted "ttx_R",
where R is an integer row number. A renderer can use either
the line number or the region ID to render the text.
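A small sketch of that mapping (illustrative only; the actual parser code differs):

```cpp
#include <string>
#include <utility>

// Illustrative helper: teletext rows are displayed with double height, so a
// row in 0-23 collapses to a line in 0-11 (half steps possible) plus a
// "ttx_<row>" region ID that a renderer can use instead of the line number.
std::pair<float, std::string> TeletextRowToLineAndRegion(int teletext_row) {
  const float line = teletext_row / 2.0f;   // e.g. row 19 -> line 9.5
  const int region_row = teletext_row / 2;  // region IDs use the integer part
  return {line, "ttx_" + std::to_string(region_row)};  // e.g. "ttx_9"
}
```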
## ttml generation for teletext to EBU-TT-D
Add support for rendering teletext input in EBU-TT-D (IMSC-1) format.
This includes the appropriate regions ttx_0 to ttx_11 signalled
in the TextSamples, alignment, and text and background colors.
The general TTML output has been changed to always include
metadata, layout, and styling nodes, even if they are empty.
EBU-TT-D is detected by the presence of "ttx_?" regions in the
samples. If detected, extra TTML elements will be added and
the EBU-TT-D linePadding will be used as well.
Appropriate styles for background and text colors are generated
depending on the color and backgroundColor attributes in the
text fragments.
## adapt WebVTT output to teletext TextSamples
Teletext input generates both a region with the prefix ttx_
and a floating point line number (e.g. 9.5) in the
range 0 to 11.5 (since the input rows 0-23 are double-height lines).
The output is adapted to drop such regions
and convert the line number to an integer,
since the WebVTT standard only uses floats for percentage
values, not for plain line numbers.
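A minimal sketch of that adaptation, assuming a simple cue-settings struct (not the packager's actual WebVTT types):

```cpp
#include <cmath>
#include <string>

// Hypothetical cue settings for illustration only.
struct VttCueSettings {
  std::string region;  // empty means "no region" in this sketch
  int line = 0;
};

VttCueSettings AdaptTeletextCue(const std::string& region, float line) {
  VttCueSettings settings;
  // Drop teletext-specific regions; keep any other region as-is.
  if (region.rfind("ttx_", 0) != 0)
    settings.region = region;
  // WebVTT line settings are integers (or percentages), so round the
  // half-line values coming from double-height teletext rows.
  settings.line = static_cast<int>(std::lround(line));
  return settings;
}
```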
Fixes #1356, which was caused by the fix in #1281 that updated this to
use the correct FairPlay system ID. However, since old versions
recognized the previous system ID, this restores support for it to avoid
breaking clients.
feat: Added the audio specific configuration (udts) box to AudioSampleEntry
for MP4 input/output, and DASH tags for DTS audio as specified in ETSI TS
103 491 and ETSI TS 102 114.
Closes #1301
---------
Co-authored-by: Cosmin Stejerean <cstejerean@meta.com>
Part of https://github.com/shaka-project/shaka-packager/issues/369
This adds read support for some MPEG-TS PMT elementary stream
descriptors:
- ISO639 Language Descriptor providing language code and audio type
- Maximum Bitrate Descriptor providing peak stream bandwidth
These metadata are propagated to the StreamInfo structures:
- StreamInfo.language field
- AudioStreamMetadata.max_bitrate field for audio streams
- the audio type is currently not propagated; a corresponding field has to be
added to AudioStreamMetadata
A test vector file containing these descriptors is provided.
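For reference, a hedged sketch of the two descriptor payload layouts (field widths per ISO/IEC 13818-1; the struct and function names are illustrative, not the packager's parser):

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Illustrative destination for the parsed values.
struct EsMetadata {
  std::string language;          // e.g. "eng" -> StreamInfo.language
  uint8_t audio_type = 0;        // not yet propagated (see note above)
  uint32_t max_bitrate_bps = 0;  // -> AudioStreamMetadata.max_bitrate
};

// ISO_639_language_descriptor (tag 0x0A): 3-byte language code + audio_type.
// `data`/`size` cover the descriptor payload after the tag and length bytes.
void ParseIso639LanguageDescriptor(const uint8_t* data, size_t size,
                                   EsMetadata* out) {
  if (size < 4) return;
  out->language.assign(reinterpret_cast<const char*>(data), 3);
  out->audio_type = data[3];
}

// maximum_bitrate_descriptor (tag 0x0E): 2 reserved bits, then a 22-bit rate
// expressed in units of 50 bytes/second.
void ParseMaximumBitrateDescriptor(const uint8_t* data, size_t size,
                                   EsMetadata* out) {
  if (size < 3) return;
  const uint32_t units_of_50_bytes =
      ((data[0] & 0x3F) << 16) | (data[1] << 8) | data[2];
  out->max_bitrate_bps = units_of_50_bytes * 50 * 8;
}
```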
The current mbedtls integration was not working for some modes; see for
example #1316 and also lots of failing integration tests.
For example, the pattern encryptor works on one block at a time, so it
cannot assume it will always get a buffer with padding room for an
extra block.
From what I can tell, when the padding mode is correctly set to
`MBEDTLS_PADDING_NONE`, there is no extra block being written to or
required.
This passes all crypto unit tests and integration tests.
Closes #1316
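A sketch of the relevant mbedtls setup, assuming AES-128-CBC (illustrative usage only, not the packager's AES cryptor): with `MBEDTLS_PADDING_NONE`, a full input block produces exactly one output block, so no room for an extra padding block is needed.

```cpp
#include <mbedtls/cipher.h>

#include <cstddef>
#include <cstdint>

bool EncryptOneCbcBlock(const uint8_t key[16], const uint8_t iv[16],
                        const uint8_t in[16], uint8_t out[16]) {
  mbedtls_cipher_context_t ctx;
  mbedtls_cipher_init(&ctx);
  bool ok =
      mbedtls_cipher_setup(
          &ctx, mbedtls_cipher_info_from_type(MBEDTLS_CIPHER_AES_128_CBC)) == 0 &&
      // The key point of the fix: no padding, so no extra block is produced.
      mbedtls_cipher_set_padding_mode(&ctx, MBEDTLS_PADDING_NONE) == 0 &&
      mbedtls_cipher_setkey(&ctx, key, 128, MBEDTLS_ENCRYPT) == 0 &&
      mbedtls_cipher_set_iv(&ctx, iv, 16) == 0 &&
      mbedtls_cipher_reset(&ctx) == 0;
  size_t out_len = 0;
  ok = ok && mbedtls_cipher_update(&ctx, in, 16, out, &out_len) == 0 &&
       out_len == 16;  // exactly one block out, nothing held back for padding
  mbedtls_cipher_free(&ctx);
  return ok;
}
```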
This work was done over ~80 individual commits in the `cmake` branch,
which are now being merged back into `main`. As a roll-up commit, it is
too big to be reviewable, but each change was reviewed individually in
the context of the `cmake` branch. After this, the `cmake` branch will be
renamed `cmake-porting-history` and preserved.
---------
Co-authored-by: Geoff Jukes <geoffjukes@users.noreply.github.com>
Co-authored-by: Bartek Zdanowski <bartek.zdanowski@gmail.com>
Co-authored-by: Carlos Bentzen <cadubentzen@gmail.com>
Co-authored-by: Dennis E. Mungai <2356871+Brainiarc7@users.noreply.github.com>
Co-authored-by: Cosmin Stejerean <cstejerean@gmail.com>
Co-authored-by: Carlos Bentzen <carlos.bentzen@bitmovin.com>
Co-authored-by: Cosmin Stejerean <cstejerean@meta.com>
Co-authored-by: Cosmin Stejerean <cosmin@offbytwo.com>
GCC version 13 needs `<cstdint>` to be explicitly included to
provide the fixed-width integer types.
Some files using them include `<stdint.h>`, while others are missing direct or
indirect inclusion. This PR adds the `<cstdint>` inclusion to the
minimal set of files, allowing compilation on GCC 13.
Closes #1305
As per the AV1 spec, the codec string may contain optional color values.
This extracts the missing color information from the mp4 `colr` atom, if
present, and generates the full AV1 codec string.
Closes #1007
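As a reference for the string layout (per the AV1 codecs-parameter definition in the AV1 ISO-BMFF binding), here is a simplified, illustrative builder; it is not the packager's actual code.

```cpp
#include <cstdio>
#include <string>

// The first four fields come from the AV1 config; the last four correspond to
// the colr (nclx) values. The trailing fields are optional in the codec
// string and have spec-defined defaults when omitted.
std::string BuildAv1CodecString(int profile, int level, char tier,
                                int bit_depth, int monochrome,
                                int chroma_subsampling,  // e.g. 110 for 4:2:0
                                int color_primaries, int transfer_characteristics,
                                int matrix_coefficients, int full_range) {
  char buf[64];
  // e.g. "av01.0.04M.10.0.110.09.16.09.0" for 10-bit HDR10-style content.
  std::snprintf(buf, sizeof(buf), "av01.%d.%02d%c.%02d.%d.%03d.%02d.%02d.%02d.%d",
                profile, level, tier, bit_depth, monochrome, chroma_subsampling,
                color_primaries, transfer_characteristics, matrix_coefficients,
                full_range);
  return buf;
}
```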
# LL-DASH Support
These changes add support for LL-DASH streaming.
**NOTE:** LL-HLS support is still in progress, but it's coming. :)
## Testing
`./chunking_unittest --gtest_filter="ChunkingHandlerTest.LowLatencyDash"`
`./media_event_unittest --gtest_filter="MpdNotifyMuxerListenerTest.LowLatencyDash"`
`./mpd_unittest --gtest_filter="PeriodTest.LowLatencyDashMpdGetXml"`
`./mpd_unittest --gtest_filter="SimpleMpdNotifierTest.NotifyAvailabilityTimeOffset"`
`./mpd_unittest --gtest_filter="SimpleMpdNotifierTest.NotifySegmentDuration"`
`./mpd_unittest --gtest_filter="LowLatencySegmentTest.LowLatencySegmentTemplate"`
Note: packager_test must be run from the main project directory.
`./out/Release/packager_test --gtest_filter="PackagerTest.LowLatencyDashEnabledAndUtcTimingNotSet"`
This converts all time parameters to signed, finishing a cleanup that
was started in 2018 in b4256bf0. This changes the type of:
- timestamps
- PTS specifically
- timestamp offsets
- timescales
- durations
This excludes:
- MP4 box definitions
- DTS specifically
This is meant to address signed/unsigned conversion issues on arm64
that caused some test cases to fail.
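A tiny illustration (not packager code) of the kind of issue signed time types avoid:

```cpp
#include <cstdint>
#include <iostream>

int main() {
  // Unsigned arithmetic wraps instead of going negative.
  uint64_t pts = 0;
  uint64_t offset = 3000;
  std::cout << pts - offset << "\n";  // 18446744073709548616, not -3000

  // Note also that converting a negative floating-point value to an unsigned
  // integer type is undefined behavior and differs between x86 and arm64,
  // which is one plausible source of arm64-only failures.
  int64_t signed_pts = 0;
  int64_t signed_offset = 3000;
  std::cout << signed_pts - signed_offset << "\n";  // -3000 as intended
  return 0;
}
```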
Change-Id: Ic752a20cbc6e31fea6bc0894d1771833171e7cbe
This also allows setting the language of different text streams from
the same input. Multiple streams can use the same input stream
using different cc_index values and can each use a different language.
This will also try to pull the language from the input if not
specified.
Change-Id: I7078710b509b7d77dad8cb4299a82f954af7e9e7
Issue #149
Co-authored-by: Andreas Motl <andreas.motl@elmyra.de>
Co-authored-by: Rintaro Kuroiwa <rkuroiwa@google.com>
Co-authored-by: Ole Andre Birkedal <o.birkedal@sportradar.com>
Previously, if there were no bytes remaining, SkipBytes(0) would fail,
which resulted in a parsing error in
AACAudioSpecificConfig::ParseProgramConfigElement.
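Conceptually the fix is an early-out for a zero-byte skip; a minimal sketch (the real reader is more involved):

```cpp
#include <cstddef>

// Illustrative reader: SkipBytes(0) must succeed even at end of data.
class ByteSkipper {
 public:
  explicit ByteSkipper(size_t size) : pos_(0), size_(size) {}

  bool SkipBytes(size_t num_bytes) {
    if (num_bytes == 0)
      return true;  // nothing to do, even if no bytes remain
    if (num_bytes > size_ - pos_)
      return false;
    pos_ += num_bytes;
    return true;
  }

 private:
  size_t pos_;
  size_t size_;
};
```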
Fixes #875.
Change-Id: I271899a37303d0d3fa0cf1bf90f99227058b82df
This changes the default MP4 output to use TTML and adds a way to
choose which format is used. This is done with 'format=ttml+mp4' or
'format=vtt+mp4'.
This also fixes the boxes output for WebVTT in MP4.
Change-Id: Ieaa7fc44fbf4dc020a5bb70cfa3578ec10e088ce
This only supports TTML output, meaning the user can convert WebVTT
into TTML, but not the other way around. This will be useful for
DVB-sub subtitles, which are better supported within TTML.
This only adds text-based output; a follow-up will add MP4 support.
Change-Id: I0944b7df95d7765e55f203fc5e9a644f5c455dd8
This adds more generic settings for regions and CSS styles. These are
global settings, so they go on the StreamInfo object.
Change-Id: Ibb76c060206152ccf8e9a067c09877226f67c927
Now text cues are composed of nested fragments that can be individually
styled. This allows portions of the cue to be bold, etc. The
WebVTT parser doesn't parse the inline tags in the input, but the original
tags are preserved in the WebVTT output. The WebVTT output will add tags
if the style elements are present in the cue object.
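A rough sketch of the nested-fragment idea (the packager's actual fragment type differs in detail):

```cpp
#include <string>
#include <vector>

// Illustrative only: a cue body is a tree of fragments, and each fragment can
// carry its own styling, so just part of a cue can be bold or italic.
struct FragmentSketch {
  std::vector<FragmentSketch> sub_fragments;  // non-empty for composite nodes
  std::string body;                           // used when this is a leaf
  bool bold = false;
  bool italic = false;
  bool underline = false;
};
```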
Change-Id: I6abba4175e376e4f753193f7d8cac63e958d3c89
Now the Cue settings are a generic object that is parsed in WebVTT.
This will allow populating the settings in different parsers without having
to use WebVTT specifics.
Change-Id: I36689bec725bd2e515af962b7174fc5977f96fa2
This sets the groundwork for more generic text cues by having a more
generic object for the settings and the body. This also changes the
TextSample to be immutable and to accept the fields in the constructor
instead of using setters.
Change-Id: I76b09ce8e8471a49e6bf447e8c187f867728a4bf
Now text-based WebVTT also uses the generic media pipeline. This
converts the WebVttTextOutputHandler to a WebVttMuxer to be more
consistent with the other muxer types.
This also allows choosing between single-segment and multi-segment text.
Before, we would generate both and use single-segment for DASH and
multi-segment for HLS; now you can choose either one, and both are
supported in DASH and HLS.
Change-Id: I6f7edda09e01b5f40e819290d3fe6e88677018d9
Now the same pipeline that handles the audio/video streams also handles
the segmented text streams. This doesn't apply to the text output,
only to the MP4 variants. This also fixes a bug where we added the
X-TIMESTAMP-MAP tag even when there were no TS streams; this doesn't
otherwise change the behavior around that tag.
Change-Id: I03f7cea56efa42e96311c00841330629a14aa053
The test added in the previous CL was broken due to a rebase on another
change. This subtly changed some of the byte offsets, which broke the
test. This wasn't caught since I didn't rebase and re-run the tests
before merging.
Change-Id: Id7e4c7688278eae37da1a14f1648263b4dda98cd
In addition to the MediaSample handling of the MediaParser, this now
adds callbacks for TextSample. This allows reading text streams from
the media files.
Change-Id: I6c00e286e98bc9aafe05b99cf2f7ce6f89d167a9
The KeySource now only handles fetching the keys and loading any PSSH
info from the license; it will not handle generating new PSSH info
based on the config.
This will allow the PSSH generation to access the full
EncryptionConfig so we can add additional options to it.
Issue #756
Change-Id: Ia67387aa3d5ec0d723b7f5f21fc517f64c840393
There were several enum types that were all used for key systems. This
combines them into one to make the code clearer and so only one type
needs to be updated. This also uses a bit field to specify multiple key
systems instead of using a std::vector.
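A hedged sketch of the bit-field approach (the enum names are illustrative, not the packager's actual identifiers):

```cpp
#include <cstdint>

// Each key system gets one bit so several can be combined in a single
// integer instead of a std::vector of enum values.
enum KeySystemFlag : uint32_t {
  kWidevineFlag  = 1u << 0,
  kPlayReadyFlag = 1u << 1,
  kFairPlayFlag  = 1u << 2,
};

inline bool HasKeySystem(uint32_t key_systems, KeySystemFlag flag) {
  return (key_systems & flag) != 0;
}

// Usage: uint32_t protection = kWidevineFlag | kPlayReadyFlag;
//        HasKeySystem(protection, kFairPlayFlag)  // -> false
```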
Change-Id: Ia88039835492a5bd47f449ba4b76187046deeec0
Call CRYPTO_library_init to properly initialize the crypto engine, which
enables AES-NI (hardware AES) if it is supported by the CPU.
Also added a performance benchmark test.
Closes #198.
Change-Id: I962a2da588d2f4f6cbe00c83ecc9a832db0e6042
- Parse and extract transfer_characteristics from the H.264/H.265 VUI
parameters.
- Set the VIDEO-RANGE attribute in HLS according to the HLS specification
(see the sketch after this list):
https://tools.ietf.org/html/draft-pantos-hls-rfc8216bis-02#section-4.4.4.2
- Also added an end-to-end test.
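The sketch referenced above, assuming the usual VUI code points (the exact mapping in the referenced draft may differ slightly):

```cpp
#include <cstdint>
#include <string>

// Typical mapping from H.264/H.265 VUI transfer_characteristics to the HLS
// VIDEO-RANGE attribute; illustrative, not the packager's exact code.
std::string VideoRangeFromTransferCharacteristics(uint8_t transfer_characteristics) {
  switch (transfer_characteristics) {
    case 16:  // SMPTE ST 2084 (PQ)
      return "PQ";
    case 18:  // ARIB STD-B67 (HLG)
      return "HLG";
    default:  // BT.709 and other SDR transfer functions
      return "SDR";
  }
}
```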
Fixes #632.
Change-Id: Iadf557d967b42ade321fb0b152e8e7b64fe9ff3e
- Add relevant FOURCCs for Dolby Vision.
- Parse the DOVIDecoderConfigurationRecord (dvcC, dvvC) to generate the
Dolby Vision codec string.
- Propagate the Dolby Vision configs (dvcC, dvvC, hvcE) from the Demuxer
to the Muxer.
- Add a Dolby Vision end-to-end test.
Support for backward-compatibility signaling in DASH and HLS will be
added in a later CL.
Issue #341
Change-Id: If1385df5f48e04b59cb7661130bea48e26b453bf
Add crypto_period_seconds to the Widevine key request
When using key rotation with Widevine DRM, the key server has to know
the duration of the crypto period to relate the generated keys to the
media playback time. This helps the server provide the relevant keys to
a client during a license request.
Closes #544.