> I was wondering if it might be possible to specify the endianness (byte order) for decoding by using a BOM

No, it is not possible. ISO 6937 clearly specifies the byte order and does not allow for both orders, and the .NET implementation just does it wrong.
Taken from http://en.wikipedia.org/wiki/ISO/IEC_6937#Two_byte_characters:
> The characters which are not represented in the primary set are coded on two bytes. The first byte, the "non spacing diacritical mark", is followed by a letter from the base set e.g.:
> small e with acute accent (é) = [Acute]+e
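To make the byte order concrete, here is a minimal Python sketch of the rule quoted above: the non-spacing diacritic byte comes first and the base letter second, while Unicode wants the combining mark after the base letter. The function name and the partial table are only illustrative (they are not taken from the MediaPortal code), and only ASCII base letters are handled.

```python
import unicodedata

# ISO 6937 non-spacing diacritic byte -> Unicode combining mark.
# Only a subset of the diacritics is listed; this is an illustration,
# not a complete ISO 6937 table.
NON_SPACING = {
    0xC1: "\u0300",  # grave
    0xC2: "\u0301",  # acute
    0xC3: "\u0302",  # circumflex
    0xC4: "\u0303",  # tilde
    0xC8: "\u0308",  # diaeresis
    0xCF: "\u030C",  # caron
}

def decode_iso6937_sketch(data: bytes) -> str:
    """Decode diacritic-first byte pairs into precomposed Unicode characters."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte in NON_SPACING and i + 1 < len(data):
            # ISO 6937: diacritic FIRST, base letter SECOND.
            # Unicode: base letter first, combining mark second, so swap.
            base = chr(data[i + 1])  # assumption: base letter is plain ASCII
            out.append(unicodedata.normalize("NFC", base + NON_SPACING[byte]))
            i += 2
        else:
            # Placeholder: correct for the ASCII range only; a real decoder
            # needs the full single-byte table here.
            out.append(chr(byte))
            i += 1
    return "".join(out)

print(decode_iso6937_sketch(b"caf\xc2e"))  # café
```

A decoder that expects the base letter before the diacritic would attach the accent to the wrong letter, which is the kind of wrong result discussed in this thread.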
So here is the code for testing. It uses a prepared text file with almost all two-byte chars defined in ISO 6937. The conversion code isn't the latest, but pretty close to it. Use it to see the wrong decoding with code page 20269.
> 1. Bug: decoder used for ISO/IEC 8859-13 and 8859-15 (3 byte encoding) is wrong.
> https://github.com/MediaPortal/Medi...TvLibrary.Interfaces/DvbTextConverter.cs#L135
> Should be 28603 and 28605, not 28591.

Yes, I have also fixed it in my code for tests.
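A small Python sketch of why the code page matters. It assumes the three-byte form is 0x10 followed by a two-byte ISO/IEC 8859 part number, as I read EN 300 468 annex A, and it uses Python codec names as stand-ins for the .NET code pages. The function name is made up for this example.

```python
# ISO/IEC 8859 part number -> Python codec name.
# The matching .NET code page is noted in each comment.
CODECS_BY_PART = {
    1: "latin_1",      # .NET code page 28591 (ISO 8859-1)
    13: "iso8859_13",  # .NET code page 28603 (ISO 8859-13)
    15: "iso8859_15",  # .NET code page 28605 (ISO 8859-15)
}

def decode_three_byte_prefixed(data: bytes) -> str:
    """Decode text prefixed with 0x10 and a two-byte ISO 8859 part number."""
    if len(data) < 3 or data[0] != 0x10:
        raise ValueError("expected 0x10 followed by a two-byte table number")
    part = (data[1] << 8) | data[2]
    return data[3:].decode(CODECS_BY_PART[part])

# Byte 0xA4 is the euro sign in 8859-15 but the currency sign in 8859-1.
print(decode_three_byte_prefixed(b"\x10\x00\x0f\xa4"))
print(b"\xa4".decode("latin_1"))
```

Decoding 8859-13 or 8859-15 text with 28591 silently produces the 8859-1 character for every byte where the tables differ, so the error only shows up on specific characters.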
> 2. Bug (?): according to EN 300 468 annex A figure A1 character conversion for ISO/IEC 6937-1 character 0xd0 should be 0x2015 not 0x2014:
> https://github.com/MediaPortal/Medi...mentations/DVB/Graphs/Iso6937ToUnicode.cs#L86

The difference isn't so big (2014 is the em dash "—", 2015 is the horizontal bar "―"), but yes, you are correct, it should be 2015. My miss: this was almost completely copied from our existing Iso6937ToUnicode.cs from TVE3.
> 3. Bug (?): according to EN 300 468 annex A figure A1 character conversion for ISO/IEC 6937-1 character 0xe2 should be 0x0110 not 0xd0:
> https://github.com/MediaPortal/Medi...entations/DVB/Graphs/Iso6937ToUnicode.cs#L128

Yes, I also checked against IEC 6937 r2001. It states "14/02 = LATIN CAPITAL LETTER D WITH STROKE", and "Đ" has code 0110. The same failure: I copied it from our existing tve3 code.
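For reference, a short Python check of the two single-byte corrections discussed in bugs 2 and 3, using the Unicode character names to show what each code point actually is. The dictionary names are only for this example and do not mirror Iso6937ToUnicode.cs.

```python
import unicodedata

# Single-byte ISO 6937 code -> Unicode character, before and after the fix.
PREVIOUS = {0xD0: "\u2014", 0xE2: "\u00D0"}
CORRECTED = {0xD0: "\u2015", 0xE2: "\u0110"}

for byte, new_char in CORRECTED.items():
    old_char = PREVIOUS[byte]
    print(
        f"0x{byte:02X}: "
        f"U+{ord(old_char):04X} {unicodedata.name(old_char)} -> "
        f"U+{ord(new_char):04X} {unicodedata.name(new_char)}"
    )
```

The old 0xE2 mapping pointed at the Icelandic eth (U+00D0), which looks almost identical to D with stroke (U+0110) in most fonts, so this one is easy to miss by eye.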
> The comment under EN 300 468 annex A table A1 says "This table is a superset of ISO/IEC 6937..." but it only shows single byte characters

I believe ETSI didn't want to put all 332 chars covered by IEC 6937 into their document, so they added the remark "light pink non-spacing symbols (diacritical marks)", meaning that these symbols apply their diacritical sign to the following letter. This is not a very precise wording, but from what I have seen in the logs from gurabli, the encoding "character table 00" in ETSI is IEC 6937 with one extra char, €.
no, these are source files.
I still can't resolve the problems with updating my local patched Git repository, so sorry that I cannot supply you with patched files for 1.6. Will do so as soon as I get my local repo updated.
Maybe @mm1352000 can tell us if he has already fixed the code in tve35 getString468A in TsWriter DvbUtil.cpp; otherwise we need to patch that part too.