Japanese Windows 2byte code #880
Comments
This problem will be difficult for me to fix since I won't be able to test the correctness of my changes. Can you install a Perl interpreter (e.g. https://strawberryperl.com/) on your computer? If so, I can send you my code changes and you can test them for me.
Thank you for your response. I have already installed it.
eg "cloc --encoding shiftjis" to properly handle Japanese file paths on Windows
Let's tackle this in steps. At the moment cloc still does / -> \ and uses the upper_lower_map. However, I added a new option to take an arbitrary encoding so this

Also, if it is possible, please see if the original cloc works from the Windows Subsystem for Linux on your computer. On my Linux computer there are no problems with Japanese file names or directories:

    » find . -type f
    ./能.c
    ./日本語/能率.c
    » cloc --by-file .
           2 text files.
           2 unique files.
           0 files ignored.

    github.com/AlDanial/cloc v 2.03  T=0.00 s (443.8 files/s, 1109.4 lines/s)
    ----------------------------------------------------------------------------------
    File                          blank        comment           code
    ----------------------------------------------------------------------------------
    ./日本語/能率.c                   0              0              2
    ./能.c                            1              0              2
    ----------------------------------------------------------------------------------
    SUM:                              1              0              4
    ----------------------------------------------------------------------------------
Thank you for the update.
I tested with --encoding shiftjis. File paths that contain double-byte characters have special byte values from 0x80 to 0xff, so they need handling different from this code:
$file =~ s{\\}{/}g if $ON_WINDOWS;
$file =~ s{
I did some debugging, but have not finished.
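One possible reason a byte-wise backslash substitution breaks such paths: in Shift-JIS the second byte of some double-byte characters is 0x5C, the same value as an ASCII backslash, and 能 (0x94 0x5C) is one of them. The sketch below is not cloc code. It assumes Perl's core Encode module, uses cp932 (the Windows code page variant of Shift-JIS) as the encoding name, and invents the variable names and the sample path 日本語\能率.c for illustration. It compares substituting on raw bytes with substituting on decoded characters.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical path 日本語\能率.c, written as code points so this source stays ASCII.
my $path_chars = "\x{65E5}\x{672C}\x{8A9E}\\\x{80FD}\x{7387}.c";

# The byte string a Japanese Windows system would supply (assumption: code page 932).
my $path_bytes = encode('cp932', $path_chars);

# Byte-wise substitution: also rewrites the 0x5C trail byte inside 能.
(my $naive = $path_bytes) =~ s{\\}{/}g;

# Character-wise substitution: decode, substitute, then encode again.
my $decoded = decode('cp932', $path_bytes);
$decoded =~ s{\\}{/}g;
my $safe = encode('cp932', $decoded);

# Print hex dumps so the difference is visible on any terminal.
printf "original bytes : %s\n", unpack('H*', $path_bytes);
printf "byte-wise      : %s\n", unpack('H*', $naive);
printf "character-wise : %s\n", unpack('H*', $safe);
printf "slashes written: byte-wise %d, character-wise %d\n",
    ($naive =~ tr{/}{}), ($safe =~ tr{/}{});
```

In the byte-wise result the trail byte of 能 is changed as well, so the bytes no longer name the same file. The character-wise result changes only the real path separator.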
Let's start with the basics. Save this small program to your Windows computer and let me know if it can read the problematic files you mentioned above:
能率.csv
**Describe the bug**
A concise description of the problem you found with cloc.
Using cloc on Japanese Windows with file names that contain 2-byte characters.
**cloc; OS; OS version**
**To Reproduce**
Steps one can follow to reproduce the behavior you're seeing.
cloc.exe 能率.c
cloc.exe 日本語\source
(the path includes Japanese 2-byte characters)
**Expected result**
A concise description of what you expected to happen.
The cloc command cannot find the source files.
**Additional context**
Add any other context about the problem here.
Hello
The cloc command is very useful.
On Japanese Windows, there seems to be a problem if the path name contains Japanese double-byte characters.
Do not convert between uppercase and lowercase characters. Do not convert between backslashes and slashes.
Use the <:encoding(shiftjis) open mode.
#ex: open_file("<:encoding(shiftjis)", $file, 1);
Japanese Windows uses Shift-JIS for file paths.
#import: use Encode qw(encode decode);
#ex. open mode: open(my $fh, "<:encoding(shiftjis)", $file);
#do not do this# $file =~ s{\\}{/}g if $ON_WINDOWS;
#this is not good# $upper_lower_map{$lc} = $file;
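For reference, a PerlIO :encoding layer like the one suggested above decodes a file's contents as they are read. It does not by itself change how the file name passed to open is interpreted. A minimal sketch of what the layer does, assuming Perl's core File::Temp and Encode support and using cp932 as the encoding name (shiftjis, as written in this issue, is another name Encode provides). The scratch file and variable names are illustrative only and are not taken from cloc.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file with an ASCII name whose contents are Shift-JIS encoded.
my ($out, $name) = tempfile(UNLINK => 1);
binmode($out, ':raw:encoding(cp932)');
print {$out} "\x{80FD}\x{7387}\n";    # the two characters 能率
close($out) or die "close failed: $!";

# Raw read: returns the encoded bytes exactly as stored on disk.
open(my $raw, '<:raw', $name) or die "open failed: $!";
my $bytes = <$raw>;
close($raw);
chomp $bytes;

# Decoded read: the :encoding layer turns those bytes back into characters.
open(my $in, '<:encoding(cp932)', $name) or die "open failed: $!";
my $text = <$in>;
close($in);
chomp $text;

printf "raw length: %d bytes, decoded length: %d characters\n",
    length($bytes), length($text);
```

The file name in both open calls is handled the same way regardless of the layer, so a path containing double-byte characters would need separate treatment.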
Please fix it.
Thank you.
Koji Furukawa