DVB-S EPG wrong character encoding for Hungarian language

Vasilich

Portal Pro
August 30, 2009
3,394
1,170
Germany, Mayence
Home Country
Russian Federation Russian Federation
next version. Not sure if it will work in your environment, as I compiled the master branch with .NET switched to 3.5 instead of 4.0.
Try it and tell me if TVService is
1. able to start
and 2. able to grab EPG.
If it starts and grabs EPG, but not in the correct encoding for some channels, then I have a TSWriter version with extra EPG logging ready. I'll give you instructions if necessary.
 

Attachments

  • TVLibraryTest3.zip
    259 KB

gurabli

Portal Pro
July 20, 2010
242
5
Home Country
Hungary Hungary
Thanks for the file, I will test it today when I get home from work and post back the results! If I'm correct, TVLibrary.dll should be replaced in two folders (or is it .interfaces? I don't remember now).
 

Vasilich

In case the encoding with TVLibrary v3 is still wrong:
  1. Stop TVService
  2. Put empty file named TsWriter_EPG.log in TVServer log folder (normally c:\Users\All Users\Team MediaPortal\MediaPortal TV Server\log\)
  3. Backup TSWriter.ax from TVServer program folder (usually c:\Program Files (x86)\Team MediaPortal\MediaPortal TV Server\)
  4. Put my TSWriter.ax (attached here) in its place.
  5. Start TVService
  6. Make sure that in TVSetup - EPG - "EPG grabbing while timeshifting" is checked, and timeout there is set to 1 minute (so you don't need to wait long)
  7. Erase all EPG via pressing TVSetup - Manual control - "refresh DVB EPG".
  8. Watch the channel that has the wrongly encoded EPG for at least 3 minutes. Write down the time you started watching.
  9. Take a screenshot of the incorrectly encoded EPG for that channel (if possible, with visible diacritics).
  10. Switch to a channel that uses the same language as the previous one (Hungarian, Czech, Serbian, wherever you have found the wrong encoding) but has properly encoded EPG, and watch that channel for 3 minutes as well. Write down the time you started watching.
  11. Take a screenshot of the correctly encoded EPG for that channel (if possible, with visible diacritics).
  12. Stop TV.
  13. Zip TsWriter.log, and post it here together with the screenshots and times.
I hope this only sounds complex and is actually easy to do.
If you want this fixed (in case the TVLibrary from my previous post didn't help), I need your help. As already said, I tested it with the German and Polish channels that I found on Astra 19.2 and HotBird 13.0, but these satellites have no FTA channels with Russian, Hungarian, Czech, Serbian or Croatian EPG, so I am not able to test it myself.
Thanks in advance.
 

Attachments

  • TsWriter_EPGLogging.zip
    140.5 KB

gurabli

Unfortunately the ver3 files did not solve the issue, but they did work. I mean EPG was grabbed fine, and for UPC provider the characters were fine, but not for other Hungarian providers. I did the test with the tswriter you have provided, attached is the log file and the two screenshots.

Screen1: Digi sport 1 HD (provider DIGI TV RCS), wrong EPG, should be Labdarúgás
Playback started at 17:22

Screen2: Sport 1 HD (provider UPC Direct), correct EPG (with new tvlibrary files). You can see that Digi Sport 1 HD should be the same as Sport 1 HD, as both are broadcasting football atm (Hungarian: Labdarúgás)
Playback started: 17:27

I hope this will help you and we can resolve these issues! Let me know if I can help with testing further!

Thanks!
 

Attachments

  • screen1.png
    511.7 KB
  • screen2.png
    524 KB

Vasilich

@gurabli thanks for your effort.
From the logs I see that UPC Direct (check the channel details, it should have networkid=[0x600], transportid=[0x2C1], serviceid=[0x7725]) uses the proper encoding for its EPG:
06-11-2013 17:27:46.550 EPG Info for channel with networkid=[0x600], transportid=[0x2C1], serviceid=[0x7725]:
06-11-2013 17:27:46.550 Event: id=[0x1977], date=[06.11.2013 (MJD=0xDD1A)], timeUTC=[12:00:00], duration=[01:15:00], running_status=[0], free_CA_mode=[1]
06-11-2013 17:27:46.550 Genre :genre=[football/soccer][0x0403]
06-11-2013 17:27:46.550 Short Descr+:lang=[cze], event=[Fotbal] / enc=[-], text=[MS do 17 let, SpojenÂe arabskÂe emirÂaty. SemifinÂale. ÏSvÂedsko - NigÂerie.] / enc=[-]
06-11-2013 17:27:46.550 Short Descr+:lang=[eng], event=[FOOTBALL] / enc=[-], text=[FIFA U-17 World Cup in United Arab Emirates. Semi-finals: Sweden - Nigeria] / enc=[-]
06-11-2013 17:27:46.550 Short Descr+:lang=[hun], event=[LabdarÂugÂas] / enc=[-], text=[17 Âev alattiak vilÂagbajnoksÂaga, EgyesÈult Arab EmÂirsÂegek, elÍodÈontÍo: SvÂedorszÂag - NigÂeria.] / enc=[-]
06-11-2013 17:27:46.550 Short Descr+:lang=[rum], event=[FOTBAL] / enc=[-], text=[FIFA U-17 World Cup in United Arab Emirates. Semi-finals: Sweden - Nigeria] / enc=[-]
enc=- means that no encoding flag is specified, and according to ETSI EN 300 468 V1.13.1:
If the first byte of the text field has a value in the range "0x20" to "0xFF" then this and all subsequent bytes in the text
item are coded using the default character coding table (table 00 - Latin alphabet) of figure A.1.
..
Figure A.1: Character code table 00 - Latin alphabet with Unicode equivalents
NOTE: This table is a superset of ISO/IEC 6937 with addition of the Euro symbol (U+20AC) in position 0xA4.
The strange capital letters preceding the "normal" letters (i.e. the letters you would expect there, but carrying diacritics) are how ISO 6937 encodes accented characters: a non-spacing diacritic byte comes first, followed by the base letter.
So far - good.
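
To make the two-byte scheme concrete, here is a minimal C# sketch of how such a string could be turned into Unicode. It is not MP's `Iso6937ToUnicode`; the class name `Iso6937Sketch` is hypothetical, only the four diacritic bytes visible in the log excerpt above (0xC2, 0xC8, 0xCD, 0xCF) are mapped, base letters are assumed to be ASCII, and every other byte above 0x7F is replaced with U+FFFD.

```csharp
using System.Collections.Generic;
using System.Text;

public static class Iso6937Sketch
{
    // Non-spacing diacritic bytes seen in the log excerpt, mapped to
    // Unicode combining marks. A real table 00 decoder needs more entries.
    static readonly Dictionary<byte, char> Combining = new Dictionary<byte, char>
    {
        { 0xC2, '\u0301' }, // acute        (Âu -> ú)
        { 0xC8, '\u0308' }, // diaeresis    (Èu -> ü)
        { 0xCD, '\u030B' }, // double acute (Ío -> ő)
        { 0xCF, '\u030C' }, // caron        (ÏS -> Š)
    };

    public static string Decode(byte[] data)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < data.Length; i++)
        {
            byte b = data[i];
            char mark;
            if (Combining.TryGetValue(b, out mark) && i + 1 < data.Length)
            {
                // ISO 6937 order is diacritic, then base letter;
                // Unicode order is base letter, then combining mark.
                sb.Append((char)data[++i]);
                sb.Append(mark);
            }
            else if (b < 0x80)
            {
                sb.Append((char)b);   // plain ASCII
            }
            else
            {
                sb.Append('\uFFFD');  // not handled in this sketch
            }
        }
        // Compose base letter + combining mark into a single character.
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

Fed with the bytes behind `LabdarÂugÂas` (4C 61 62 64 61 72 C2 75 67 C2 61 73), `Decode` returns `Labdarúgás`.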

Now - to DIGI TV:
06-11-2013 17:22:05.392 EPG Info for channel with networkid=[0x1], transportid=[0x5], serviceid=[0xA32]:
06-11-2013 17:22:05.392 Event: id=[0x25CE], date=[06.11.2013 (MJD=0xDD1A)], timeUTC=[16:45:00], duration=[01:00:00], running_status=[0], free_CA_mode=[1]
06-11-2013 17:22:05.392 Short Descr+:lang=[hun], event=[Labdarúgás] / enc=[-], text=[Spanyol bajnokság, 12. forduló, összefoglaló] / enc=[-]
06-11-2013 17:22:05.392 ExtendDescr*:lang=[hun], part=[1/1], event=[Labdarúgás] / enc=[-], text=[ ] / enc=[-]
Here we see a clear problem: this provider doesn't set the encoding flag as the DVB specification requires. If no encoding is specified (enc=-), every letter with a diacritic should be encoded as 2 bytes, and that is clearly not the case. Blame the provider, they should follow the DVB specs. IMHO the most suitable choice for them would be either the encoding that UPC Direct uses or ISO 8859-2 (East European). I have seen 8859-2 used by Polish providers (TVP Polonia):
07-11-2013 02:00:26.677 EPG Info for channel with networkid=[0x1], transportid=[0x423], serviceid=[0x1BBE]:
07-11-2013 02:00:26.677 Event: id=[0x1AC7], date=[09.11.2013 (MJD=0xDD1D)], timeUTC=[03:00:00], duration=[01:05:00], running_status=[0], free_CA_mode=[0]
07-11-2013 02:00:26.678 Short Descr+:lang=[pol], event=[Dzieñ powszedni w Kabulu{PL}] / enc=[10.2.2], text=[] / enc=[10.2.2]

Currently I have no idea how to solve this problem of wrongly marked encoding with some providers. The approach used for Czech providers in the old code (forcing one predefined encoding for all strings marked as [cze]) cannot be applied here, because different Hungarian providers use different encodings although all of them are marked as [hun]. If you get any idea, I'd be glad to hear and implement it :)
 

gurabli

Many thanks Vasilich for checking all this, your effort and help are very much appreciated!

Just a note: it seems that MePo handles the EPG information of DIGI TV and other providers correctly with a default install. If I'm correct, the EPG for the DIGI TV provider is correct without your patched tvlibrary files. When your patched tvlibrary files are used, the EPG for UPC Direct is correct, but DigiTV and other providers are wrong. I know it doesn't help a lot in our case, as both providers are marked as HUN, but perhaps we could look at the Serbian channels, where diacritics are also wrong, and I'm sure the language is marked as Serbian there. Perhaps we could check this?

I will make a new log and screenshot for a Serbian channel, maybe you can see there how to fix EPG there.

I will also need to check if with the default tvlibrary DIGITV epg is correct or not (I do not remember, only that UPC is not correct for sure).

I do not have any ideas except these. Maybe somehow to look at DVBViewer, as there all the characters are fine and correct for all the providers, using the same DVB streams. This means that it can be solved somehow, but I really do not know how.

Will be back at evening with the logs!
 

Vasilich

I can say how it works in stock MP:
1. In one place, TvCardDvbBase, the Czech language is checked for and forced to ISO 6937, ignoring the encoding flags:
Code:
if (language.ToUpperInvariant() == "CZE" || language.ToUpperInvariant() == "CES")
{
  title = Iso6937ToUnicode.Convert(ptrTitle);
  description = Iso6937ToUnicode.Convert(ptrDesc);
}
2. In another place, DvbTextConverter, the current locale is checked for Czech (again!) and for Russian/Belarusian/Ukrainian. In both cases a predefined encoding is forced if no encoding flag is found in the text (which is exactly the case where ISO 6937 should be used):
Code:
lang = lang.ToLowerInvariant();
if (lang == "cze" || lang == "ces")
{
  encoding = 20269; //ISO-6937
}
else if (lang == "ukr" || lang == "bel" || lang == "rus")
{
  encoding = 28595; //ISO-8859-5
}
What is currently missing in stock MP is proper (i.e. spec-compliant) handling of a missing encoding flag.
The spec says that if no encoding byte is specified, a superset of ISO 6937 is used.
MP, if the encoding byte isn't there, falls back to the default ANSI code page of the TVService user:
Code:
encoding = CultureInfo.CurrentCulture.TextInfo.ANSICodePage;
The MP approach can work better for providers that don't follow the DVB specs, but it produces wrong text for broadcasters that correctly use the default ISO 6937 encoding (e.g. UPC Direct).

I have one idea: when no encoding is specified, I can first check whether the given string is valid ISO 6937. I will check whether it contains any symbols or symbol combinations that are not defined in 6937. If it does, we have a bad provider, and I fall back to the second option: converting the string according to the current ANSI code page. Does that seem right?
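
A rough C# sketch of that check, under stated assumptions: the names `Iso6937Check`, `LooksLikeIso6937` and `Decode` are hypothetical, the diacritic range is taken as 0xC1..0xCF, and a diacritic byte is accepted only when an ASCII letter follows it. It covers only the "symbol combinations" half of the idea; a table of positions that are undefined in 6937 would be needed for the other half. The two converters are passed in by the caller, so the fallback can be whatever code page TVService would otherwise use.

```csharp
using System;
using System.Text;

public static class Iso6937Check
{
    // Assumption: bytes 0xC1..0xCF are the non-spacing diacritics.
    static bool IsDiacriticByte(byte b)
    {
        return b >= 0xC1 && b <= 0xCF;
    }

    // Returns false as soon as a diacritic byte is not followed by an
    // ASCII letter, which a spec-compliant 6937 string should not contain.
    public static bool LooksLikeIso6937(byte[] data)
    {
        for (int i = 0; i < data.Length; i++)
        {
            if (!IsDiacriticByte(data[i])) continue;
            if (i + 1 >= data.Length) return false;   // dangling diacritic
            byte next = data[i + 1];
            bool isLetter = (next >= 'A' && next <= 'Z') || (next >= 'a' && next <= 'z');
            if (!isLetter) return false;
            i++;                                      // skip the base letter
        }
        return true;
    }

    // Tries ISO 6937 first; otherwise decodes with the caller's fallback.
    public static string Decode(byte[] data, Func<byte[], string> iso6937, Encoding fallback)
    {
        return LooksLikeIso6937(data) ? iso6937(data) : fallback.GetString(data);
    }
}
```

A string that passes this check is not guaranteed to be ISO 6937, so the result would still need to be compared against real provider logs.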
 

regeszter

Retired Team Member
  • Premium Supporter
  • October 29, 2005
    5,335
    4,954
    Home Country
    Hungary Hungary
    @Vasilich

    Can you summarize for me how each provider in Hungary encodes the EPG?
     

    Vasilich

    @regeszter
    I can only rely on the logs provided by @gurabli, and from those I can see:
    06-11-2013 17:27:46.550 EPG Info for channel with networkid=[0x600], transportid=[0x2C1], serviceid=[0x7725]:
    According to KingOfSat, this is the channel "Eurosport HD" on Thor 0.8W 12034.0V, so the provider is UPC Direct. No encoding flags and, as the DVB specs require in that case, ISO 6937 encoding. All is correct.
    06-11-2013 17:22:05.392 EPG Info for channel with networkid=[0x1], transportid=[0x5], serviceid=[0xA32]:
    "Sport 1 Hungary" on Thor 0.8W 12563H, provider: RCS DigiTV. No encoding flags, but the encoding is not ISO 6937 (wrong, not according to DVB). A quick check suggests it is ISO 8859-2, though I'm not sure. In that case each string should begin with 3 marker bytes: 0x10, 0x00, 0x02 (or even 0x10 0xxx 0x02, as the second byte is unimportant and MP ignores its value, see my previous post with the Polish provider). Note that it is not even Windows-1250, which AFAIK is the default ANSI code page for Hungarian. Please correct me if I am wrong.
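
The marker bytes can be read with a few lines of C#. This is only a sketch of my reading of the EN 300 468 Annex A selector rules, not MP's code; `DvbTextSelector` and `Describe` are hypothetical names, the second byte of the 3-byte form is ignored as described above, and reserved values are not filtered out.

```csharp
public static class DvbTextSelector
{
    // Returns a label for the character table announced by the leading
    // bytes of a DVB text field, plus the number of selector bytes to skip.
    public static string Describe(byte[] text, out int headerLength)
    {
        headerLength = 0;
        if (text == null || text.Length == 0) return "empty";

        byte first = text[0];
        if (first >= 0x20)
        {
            // No selector at all: this is what the log shows as enc=[-].
            return "default (table 00, ISO 6937 superset)";
        }
        if (first == 0x10 && text.Length >= 3)
        {
            // 3-byte form: 0x10, ignored byte, ISO 8859 part number.
            headerLength = 3;
            return "ISO-8859-" + text[2];
        }
        headerLength = 1;
        if (first >= 0x01 && first <= 0x0B)
        {
            // 1-byte form: 0x01 selects ISO 8859-5, 0x02 ISO 8859-6, and so on.
            return "ISO-8859-" + (first + 4);
        }
        if (first == 0x15) return "UTF-8";
        return "other selector 0x" + first.ToString("X2");
    }
}
```

For a string starting with 0x10 0x00 0x02 this returns "ISO-8859-2" with a header length of 3, while the DigiTV strings in the log start directly with a letter and therefore fall into the default branch.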

    If you are able to receive more providers - then use my patched TSWriter (follow the instructions here https://forum.team-mediaportal.com/...hungarian-language.122135/page-4#post-1038420 ) and post log - then we will see what encoding is in use, and if it follows DVB specs.
     