DVB-S EPG wrong character encoding for Hungarian language
[QUOTE="mm1352000, post: 1052608, member: 82144"]
No problem. It is holiday time so I wasn't expecting an answer in a hurry. :)

Heh, I'm glad I'm not going crazy. :D

To be clear, what I was thinking is that the handling of the three-byte encoding is sometimes okay and sometimes not. It depends entirely on the value of the encoding byte.

If the input is:
0x10 0x00 [encoding byte <= 0x05] [content]

...output will be:
0x10 [encoding byte <= 0x05] [encoding byte <= 0x05] [content]

This situation is okay for TV library. It sees the encoding byte in the correct position. No problem. :)

However, if the input is:
0x10 0x00 [encoding byte > 0x05] [content]

...output will be:
0x10 [encoding byte > 0x05] [content]

The encoding byte is not repeated due to this condition:
[code]else if (((c > 0x05) && (c <= 0x1F)) || ((c >= 0x80) && (c < 0x9F))) //0x1-0x5 = choose character set, must keep this byte![/code]
[url]https://github.com/MediaPortal/MediaPortal-1/blob/master/DirectShowFilters/DvbCoreUtils/DvbUtil.cpp#L133[/url]

In this case TV library will handle the content using the default encoding, which could be wrong.

Yep, completely agree. :)

Yes, I'd definitely be interested. :)
I also have some questions already. Mainly I'm trying to understand if there is any reason to keep the Iso6937ToUnicode class.

If we want to conform with EN 300 468, is it technically correct to replace this code in TvCardDvbBase.cs:
[url]https://github.com/MediaPortal/MediaPortal-1/blob/master/TvEngine3/TVLibrary/TVLibrary/Implementations/DVB/Graphs/TvCardDvbBase.cs#L2638[/url]
[code]
if (language.ToUpperInvariant() == "CZE" || language.ToUpperInvariant() == "CES")
{
  title = Iso6937ToUnicode.Convert(ptrTitle);
  description = Iso6937ToUnicode.Convert(ptrDesc);
}
else
{
  title = DvbTextConverter.Convert(ptrTitle, "");
  description = DvbTextConverter.Convert(ptrDesc, "");
}[/code]

...simply with:
[code]
title = DvbTextConverter.Convert(ptrTitle);
description = DvbTextConverter.Convert(ptrDesc);[/code]

...and, replace this code in DvbTextConverter.cs:
[url]https://github.com/MediaPortal/MediaPortal-1/blob/master/TvEngine3/TVLibrary/TvLibrary.Interfaces/DvbTextConverter.cs?source=cc#L40[/url]
[code]
int encoding = CultureInfo.CurrentCulture.TextInfo.ANSICodePage;
try
{
  if (string.IsNullOrEmpty(lang))
  {
    lang = CultureInfo.CurrentCulture.ThreeLetterISOLanguageName;
  }
  lang = lang.ToLowerInvariant();
  if (lang == "cze" || lang == "ces")
  {
    encoding = 20269; //ISO-6937
  }
  else if (lang == "ukr" || lang == "bel" || lang == "rus")
  {
    encoding = 28595; //ISO-8859-5
  }
[/code]

...with:
[code]
encoding = 20269; // default/base: ISO-6937
[/code]

This doesn't fix:
[LIST]
[*]the bugs in TsWriter
[*]the Marshal.ReadByte(ptr, 2) bug in DvbTextConverter
[*]the Euro sign exception
[/LIST]

Again, I'm simply trying to understand whether it is safe/okay to remove Iso6937ToUnicode.cs. To me it looks like it should be okay, and we can just fix the Euro sign encoding manually.

Do you agree?

Completely agree. :)

Is this necessary, or just maybe useful for dealing with bad providers? From what I can see in EN 300 468, there is no need for an ISO 639 code to tell us how to decode.

I agree. I think we can consider such a solution, but it depends on how bad the problem is. If we only have a few reports of problems after fixing the code, then maybe we can just ask those people to use some other EPG source. I prefer that over adding configuration because so many people say that TV Server configuration is complicated enough already... :)

mm
[/QUOTE]
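To make the point about EN 300 468 concrete, here is a rough C++ sketch of how the character table can be picked from the selector byte(s) alone, with no ISO 639 language code involved. It is only an illustration under my reading of Annex A of the spec: DetectDvbEncoding and DvbTextEncoding are hypothetical names, not anything that exists in TsWriter or TV library, and reserved values plus the 0x1F (encoding_type_id) form are simply lumped together as "unsupported".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Result of inspecting the start of a DVB text field.
struct DvbTextEncoding {
    std::string name;       // human-readable character table name
    std::size_t headerLen;  // number of selector bytes before the content
};

// Hypothetical helper (not MediaPortal code): chooses the character table
// from the selector byte(s), following the layout in EN 300 468 Annex A.
DvbTextEncoding DetectDvbEncoding(const std::vector<std::uint8_t>& text) {
    if (text.empty() || text[0] >= 0x20) {
        // No selector byte: default table (ISO 6937 plus the Euro sign).
        return {"ISO-6937", 0};
    }
    const std::uint8_t first = text[0];
    if (first >= 0x01 && first <= 0x0B && first != 0x08) {
        // One-byte selector: 0x01 -> ISO-8859-5 ... 0x0B -> ISO-8859-15.
        return {"ISO-8859-" + std::to_string(first + 4), 1};
    }
    if (first == 0x10 && text.size() >= 3) {
        // Three-byte selector: 0x10, then a 16-bit table number N.
        const unsigned n = (static_cast<unsigned>(text[1]) << 8) | text[2];
        if (n >= 0x01 && n <= 0x0F && n != 0x0C) {
            return {"ISO-8859-" + std::to_string(n), 3};
        }
        return {"unsupported", 3};
    }
    switch (first) {
        case 0x11: return {"ISO-10646 (UCS-2)", 1};
        case 0x12: return {"KS X 1001", 1};
        case 0x13: return {"GB-2312", 1};
        case 0x14: return {"Big5", 1};
        case 0x15: return {"UTF-8", 1};
        // Reserved values and 0x1F are not handled in this sketch.
        default:   return {"unsupported", 1};
    }
}
```

With this, the sequence 0x10 0x00 0x02 selects ISO-8859-2 and the content starts at offset 3, which is why the encoding byte has to survive in the third position for the table number to be read correctly.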