/*
 * BIG5 code space: 126 * 157 = 19782 code points
 *
 *   1st byte  81..FE              total = 126
 *   2nd byte  40..7E  (63)
 *             A1..FE  (94)        total = 157
 *
 *   Block  First  Last   Count
 *   USR3   8140   8DFE    2041
 *   USR2   8E40   A0FE    2983
 *   SPC1   A140   A3BF     408
 *   CTRL   A3C0   A3E0      33
 *   ????   A3E1   A3FE      30
 *   STD1   A440   C67E    5401
 *   SPC2   C6A1   C8FE     408
 *   STD2   C940   F9D5    7652
 *   ETEN   F9D6   F9FE      41
 *   USR1   FA40   FEFE     785
 */
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>

typedef unsigned char  byte;
typedef unsigned short word;
typedef unsigned long  dword;

/* One 16-bit table entry, addressable as a word or as two bytes
   in file order (L = first byte in the file, H = second). */
typedef union {
    word W;
    struct { byte L, H; } B;
} WB;

const char* crlf = "\r\n";

/* Parse a hexadecimal string; '.' is skipped, any other non-hex
   character ends the number. */
word hex_atoi( char *s )
{
    word rc = 0;
    byte ch;

    while( *s ) {
        ch = (byte)toupper( (unsigned char)*s );
        s++;
        if( ( ch >= 'A' ) && ( ch <= 'F' ) ) {
            rc <<= 4;
            rc += ch - 'A' + 0x0A;
        }
        else if( ( ch >= '0' ) && ( ch <= '9' ) ) {
            rc <<= 4;
            rc += ch - '0';
        }
        else if( ch != '.' )
            break;
    }
    return rc;
}

// wkliang: 20100325 - org: B5PG = 0x81
#define B5PG 0xA1

/*
 * Map a Big5 byte pair to a zero-based ordinal, or -1 if the pair is
 * not a valid Big5 code at or above B5PG.
 * Example with B5PG = 0xA1: A140 -> 0, A3BF -> 2 * 157 + 93 = 407,
 * A440 -> 3 * 157 + 0 = 471.
 */
int b5_cod2ord( byte c1, byte c2 )
{
    if( ( c1 < B5PG ) || ( c1 > 0xFE ) )
        return -1;
    c1 = c1 - B5PG;                 // row index of the lead byte
    if( ( c2 >= 0x40 ) && ( c2 <= 0x7E ) )
        c2 = c2 - 0x40;             // 63 codes: columns 0..62
    else if( ( c2 >= 0xA1 ) && ( c2 <= 0xFE ) )
        c2 = c2 - 0xA1 + 63;        // 94 codes: columns 63..156
    else
        return -1;
    return( c1 * 157 + c2 );
}

/* Inverse of b5_cod2ord: ordinal back to a Big5 byte pair. */
void b5_ord2cod( int ord, byte *p1, byte *p2 )
{
    *p1 = ord / 157 + B5PG;
    *p2 = ord % 157;
    if( *p2 < 63 )
        *p2 += 0x40;
    else
        *p2 += -63 + 0xA1;
}

/*
 * Write the Big5 codes for ordinals ord..end to fp, two bytes each.
 * If esc >= 0, the code for ordinal esc is written in place of every
 * entry. Returns the number of entries written.
 */
int b5tbl_out( FILE* fp, int ord, int end, int esc )
{
    byte c1, c2;
    int i;

    b5_ord2cod( ord, &c1, &c2 );
    fprintf( stderr, "b5tbl_out fm %d:%02X%02X - ", ord, c1, c2 );
    b5_ord2cod( end, &c1, &c2 );
    fprintf( stderr, "to %d:%02X%02X - ", end, c1, c2 );

    for( i = 0; ord <= end; i++, ord++ ) {
        if( esc >= 0 )
            b5_ord2cod( esc, &c1, &c2 );
        else
            b5_ord2cod( ord, &c1, &c2 );
        fwrite( &c1, 1, 1, fp );
        fwrite( &c2, 1, 1, fp );
        // printf( "%5d:%02X%02X\r", i, c1, c2 );
        // fflush( stdout );
        // sleep( 1 );
        // if( ( i % 38 ) == 37 )
        //     fwrite( crlf, 1, 2, fp );
    }
    // fwrite( crlf, 1, 2, fp );
    fprintf( stderr, "generate: %d.\n", i );
    return i;
}

/* Generate a file holding every Big5 code from A140 to F9FE in
   ordinal order; CTRL and ???? entries are replaced by ordinal 0. */
int b5tbl_gen( int argc, char *argv[] )
{
    FILE *fp;
    int cnt = 0;

    if( argc < 2 ) {
        fprintf( stderr, "%s outfile\n", argv[0] );
        return EXIT_FAILURE;
    }
    if( ( fp = fopen( argv[1], "wb+" ) ) == NULL ) {
        perror( argv[1] );
        return EXIT_FAILURE;
    }

    // SPC1
    cnt += b5tbl_out( fp, b5_cod2ord(0xA1,0x40), b5_cod2ord(0xA3,0xBF), -1 );
    // CTRL + ???? should be escaped
    cnt += b5tbl_out( fp, b5_cod2ord(0xA3,0xC0), b5_cod2ord(0xA3,0xE0), 0 );
    cnt += b5tbl_out( fp, b5_cod2ord(0xA3,0xE1), b5_cod2ord(0xA3,0xFE), 0 );
    // STD1 + SPC2 + STD2 + ETEN
    cnt += b5tbl_out( fp, b5_cod2ord(0xA4,0x40), b5_cod2ord(0xC6,0x7E), -1 );
    cnt += b5tbl_out( fp, b5_cod2ord(0xC6,0xA1), b5_cod2ord(0xC8,0xFE), -1 );
    cnt += b5tbl_out( fp, b5_cod2ord(0xC9,0x40), b5_cod2ord(0xF9,0xD5), -1 );
    cnt += b5tbl_out( fp, b5_cod2ord(0xF9,0xD6), b5_cod2ord(0xF9,0xFE), -1 );

    fclose( fp );
    fprintf( stderr, "total = %d\n", cnt );
    return EXIT_SUCCESS;
}

/*
 * Load the lookup table produced from the generated file with:
 *   iconv -f big5 -t unicode infile >outfile
 * Returns the number of 16-bit entries read, or -1 on error.
 */
int b52uni_load( char* fname, WB** plst )
{
    FILE *fp;
    int rc, num;

    if( ( fp = fopen( fname, "rb" ) ) == NULL ) {
        perror( fname );
        return -1;
    }
    fseek( fp, 0, SEEK_END );
    num = (int)( ftell( fp ) / 2 );
    if( ( *plst = (WB *)malloc( num * sizeof(WB) ) ) == NULL ) {
        fclose( fp );               /* do not leak the handle */
        return -1;
    }
    fseek( fp, 2, SEEK_SET );       /* skip leading two bytes */
    rc = (int)fread( (byte*)*plst, 2, num, fp );
    fclose( fp );
    return rc;
}

WB* wlist;
int wcnt;

/*
 * Convert the Big5 string istr into 16-bit units in ostr (at most
 * ocnt bytes). Bytes that do not start a Big5 pair are widened with
 * a zero high byte. Returns the number of unused bytes left in ostr.
 */
int b52uni( char* istr, char* ostr, int ocnt )
{
    int idx;

    while( *istr ) {
        if( ocnt < 2 )              /* each unit needs two bytes */
            break;
        idx = b5_cod2ord( istr[0], istr[1] );
        if( idx < 0 ) {
            *ostr++ = *istr++;
            *ostr++ = 0x00;
        }
        else {
            if( idx >= wcnt )       /* valid indexes are 0..wcnt-1 */
                idx = 0;
            // *ostr++ = ((byte *)&(wlist[idx]))[0];
            *ostr++ = wlist[idx].B.L;
            // *ostr++ = ((byte *)&(wlist[idx]))[1];
            *ostr++ = wlist[idx].B.H;
            istr += 2;
        }
        ocnt -= 2;
    }
    /*
    if( --ocnt < 0 ) return;
    *ostr++ = 0x00;
    if( --ocnt < 0 ) return;
    *ostr++ = 0x00;
    */
    return ocnt;
}

/* Convert the Big5 text file fname line by line and write the
   result to stdout, preceded by the byte-order mark FF FE. */
int b52uni_test( char* fname )
{
    FILE* fp;
    char ibuf[ BUFSIZ ], obuf[ BUFSIZ * 2 ];
    int rc;

    if( ( fp = fopen( fname, "r" ) ) == NULL ) {
        perror( fname );
        return __LINE__;
    }
    // unicode file leading code
    obuf[0] = (char)0xFF;
    obuf[1] = (char)0xFE;
    fwrite( obuf, 1, 2, stdout );

    while( fgets( ibuf, BUFSIZ, fp ) != NULL ) {
        rc = b52uni( ibuf, obuf, sizeof(obuf) );
        fwrite( obuf, 1, sizeof(obuf) - rc, stdout );
    }
    fclose( fp );
    return 0;
}

int main( int argc, char *argv[] )
{
#ifndef B52UNI_TEST
    /* Default build: generate the table file. The original returned
       here unconditionally, which left the code below unreachable.
       B52UNI_TEST is a build switch added in this revision to make
       the conversion test selectable. */
    return b5tbl_gen( argc, argv );
#else
    if( argc < 2 ) {
        fprintf( stderr, "%s textfile\n", argv[0] );
        return EXIT_FAILURE;
    }
    wcnt = b52uni_load( "b52uni.tab", &wlist );
    fprintf( stderr, "b52uni_load %s = %d.\n", "b52uni.tab", wcnt );
    if( wcnt < 0 )
        return EXIT_FAILURE;
    b52uni_test( argv[1] );
    free( wlist );
    return EXIT_SUCCESS;
#endif
}