shaka-packager/packager/media/formats/ttml/ttml_generator.h

// Copyright 2020 Google LLC. All rights reserved.
//
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file or at
// https://developers.google.com/open-source/licenses/bsd

#ifndef PACKAGER_MEDIA_FORMATS_TTML_TTML_GENERATOR_H_
#define PACKAGER_MEDIA_FORMATS_TTML_TTML_GENERATOR_H_

#include <cstddef>
#include <cstdint>
#include <list>
#include <map>
#include <string>
#include <unordered_set>
#include <vector>

#include <packager/media/base/text_sample.h>
#include <packager/media/base/text_stream_info.h>
#include <packager/mpd/base/xml/xml_node.h>

namespace shaka {
namespace media {
namespace ttml {

class TtmlGenerator {
 public:
  explicit TtmlGenerator();
  ~TtmlGenerator();

  static const char* kTtNamespace;

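  // Stores the regions, language, and timescale used when generating output.
  void Initialize(const std::map<std::string, TextRegion>& regions,
                  const std::string& language,
                  int32_t time_scale);
  // Queues a sample to be written by a later call to Dump().
  void AddSample(const TextSample& sample);
  // Clears the queued samples.
  void Reset();

  // Serializes the queued samples as a TTML document into |result|.
  // Returns true on success.
  bool Dump(std::string* result) const;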

 private:
  bool AddSampleToXml(const TextSample& sample,
                      xml::XmlNode* body,
                      xml::XmlNode* metadata,
                      std::unordered_set<std::string>& fragmentStyles,
                      size_t* image_count) const;
  bool ConvertFragmentToXml(const TextFragment& fragment,
                            xml::XmlNode* parent,
                            xml::XmlNode* metadata,
                            std::unordered_set<std::string>& fragmentStyles,
                            size_t* image_count) const;

  bool addStyling(xml::XmlNode& styling,
                  const std::unordered_set<std::string>& fragmentStyles) const;
  bool addRegions(xml::XmlNode& layout) const;
  std::vector<std::string> usedRegions() const;
  bool isEbuTTTD() const;

  std::list<TextSample> samples_;
  std::map<std::string, TextRegion> regions_;
  std::string language_;
  int32_t time_scale_;
  // This is modified in "const" methods to create unique IDs.
  mutable uint32_t region_id_ = 0;
};

}  // namespace ttml
}  // namespace media
}  // namespace shaka

#endif  // PACKAGER_MEDIA_FORMATS_TTML_TTML_GENERATOR_H_

Add TTML text output.

This only supports TTML output, meaning the user can convert WebVTT into TTML, but not the other way around. This will be useful for DVB-sub subtitles that would be better supported within TTML. This only adds text-based output; a follow-up will add MP4 support.

Change-Id: I0944b7df95d7765e55f203fc5e9a644f5c455dd8
2020-10-08 21:46:37 +00:00

feat: teletext formatting (#1384)

This PR adds parsing of teletext styling, and rendering of the styling in output TTML and WebVTT subtitle tracks.

Beyond unit tests, I've used the sample https://drive.google.com/file/d/19ZYsoeUfH85gEilQkaAdLbPhC4CxhDEh/view?usp=sharing which has rather advanced subtitling with two separate rows at the same time, where one is left aligned and another is right aligned. This necessitates two parallel cues to be rendered. It also has some colored text.

Solves #1335.

## parse teletext styling and formatting

Extend the teletext parser to parse the teletext styling and formatting. This includes translating rows into regions, calculating alignment from the start and stop position of the text, and extracting text and background colors. The colors are limited to full lines.

Both lines and regions are propagated in the TextSample structures. This is because the number of lines may differ between sources. For teletext, there are 24 rows, but they are essentially always used with double height, so the number of output lines is 12, from 0 to 11. The corresponding regions are denoted "ttx_R", where R is an integer row number. A renderer can use either the line number or the region ID to render the text.

## ttml generation for teletext to EBU-TT-D

Add support to render teletext input in EBU-TT-D (IMSC-1) format. This includes appropriate regions ttx_0 to ttx_11 signalled in the TextSamples, alignment, and text and background colors.

The general TTML output has been changed to always include metadata, layout, and styling nodes, even if they are empty.

EBU-TT-D is detected by the presence of "ttx_?" regions in the samples. If detected, extra TTML elements will be added and the EBU-TT-D linePadding used as well. Appropriate styles for background and text colors are generated depending on the color and backgroundColor attributes in the text fragments.

## adapt WebVTT output to teletext TextSample

Teletext input generates both a region with prefix ttx_ and a floating point line number (e.g. 9.5) in the range 0 to 11.5 (because input rows 0-23 are treated as double-height lines). The output is adapted to drop such regions and convert the line number to an integer, since the standard only uses floats for percent values, not for plain line numbers.

2024-04-29 17:33:03 +00:00

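The row, region, and line handling described in the teletext formatting commit message can be sketched as below. This is only an illustration, not the code from the PR: all four helper names are hypothetical, the mapping from a 0-23 row to region "ttx_R" is assumed to be integer division by two, and the WebVTT conversion is assumed to truncate (the message does not say whether it rounds or truncates).

```cpp
#include <string>
#include <vector>

// Hypothetical helpers; none of these names come from shaka-packager.

// Teletext rows 0..23 are used double-height, so the floating point
// line number falls in the range 0..11.5.
double RowToLine(int row) {
  return row / 2.0;
}

// Region IDs run from ttx_0 to ttx_11.
// Assumption: R is the row divided by two, rounded down.
std::string RowToRegionId(int row) {
  return "ttx_" + std::to_string(row / 2);
}

// EBU-TT-D detection as described: true if any region ID carries
// the "ttx_" prefix.
bool HasTeletextRegion(const std::vector<std::string>& region_ids) {
  for (const std::string& id : region_ids) {
    if (id.rfind("ttx_", 0) == 0)
      return true;
  }
  return false;
}

// WebVTT plain line numbers are integers.
// Assumption: the fractional part is simply dropped.
int LineToWebVttLine(double line) {
  return static_cast<int>(line);
}
```

Under these assumptions, row 19 yields line 9.5 and region "ttx_9", and a WebVTT writer would emit line 9 for it.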
feat!: Rewrite build system and third-party dependencies (#1310)

This work was done over ~80 individual commits in the `cmake` branch, which are now being merged back into `main`. As a roll-up commit, it is too big to be reviewable, but each change was reviewed individually in context of the `cmake` branch. After this, the `cmake` branch will be renamed `cmake-porting-history` and preserved.

---------

Co-authored-by: Geoff Jukes <geoffjukes@users.noreply.github.com>
Co-authored-by: Bartek Zdanowski <bartek.zdanowski@gmail.com>
Co-authored-by: Carlos Bentzen <cadubentzen@gmail.com>
Co-authored-by: Dennis E. Mungai <2356871+Brainiarc7@users.noreply.github.com>
Co-authored-by: Cosmin Stejerean <cstejerean@gmail.com>
Co-authored-by: Carlos Bentzen <carlos.bentzen@bitmovin.com>
Co-authored-by: Cosmin Stejerean <cstejerean@meta.com>
Co-authored-by: Cosmin Stejerean <cosmin@offbytwo.com>

2023-12-01 17:32:19 +00:00

Add TTML-in-MP4 output support.

This changes the default MP4 output to use TTML and adds a way to choose which one is used. This is done with 'format=ttml+mp4' or 'format=vtt+mp4'. This also fixes the boxes output in WebVTT in MP4.

Change-Id: Ieaa7fc44fbf4dc020a5bb70cfa3578ec10e088ce
2020-10-13 21:43:18 +00:00


cleanup: Convert all time parameters to signed

This converts all time parameters to signed, finishing a cleanup that was started in 2018 in b4256bf0.

This changes the type of:
- timestamps
- PTS specifically
- timestamp offsets
- timescales
- durations

This excludes:
- MP4 box definitions
- DTS specifically

This is meant to address signed/unsigned conversion issues on arm64 that caused some test cases to fail.

Change-Id: Ic752a20cbc6e31fea6bc0894d1771833171e7cbe
2021-08-04 18:56:44 +00:00

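As a general illustration of why mixing signed and unsigned time values is fragile, the sketch below computes the difference between two timestamps both ways. It is not taken from the commit and does not reproduce the arm64 test failures it mentions; both function names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only.

// With signed types, an earlier timestamp minus a later one is a small
// negative number, which is what time arithmetic usually expects.
int64_t SignedDelta(int64_t a, int64_t b) {
  return a - b;
}

// With unsigned types, the same subtraction wraps around modulo 2^64
// and yields a very large positive value instead.
uint64_t UnsignedDelta(uint64_t a, uint64_t b) {
  return a - b;
}
```

For example, with timestamps 1000 and 9000 the signed version returns -8000, while the unsigned version wraps to a value far larger than either input.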
Add background image to TextSample and TTML output

Issue #832

Change-Id: I50f23223fa4362559087ada9b40488c089594450
2020-11-20 21:03:16 +00:00

Add support for text cue heights.

Issue #832

Change-Id: Ifccbd6c6c46916d3d28ac4afaba01fc158c9c361
2020-12-01 19:32:39 +00:00