Build error with Unicode #94
Comments
This is slightly surprising, since that version of GHC ought to handle Unicode Strings. I suppose what I'll do is change it to a (c) to fix the compilation error, and leave this issue open in case anyone is able to explain it. Thanks for mentioning it.
I had a similar issue when building on a new system where I hadn't done my locale configuration properly. With Running
It makes sense that my locale wasn't set properly, as I'd installed Debian via debootstrap, which only does enough configuration to get chroot working. I'll add 'set locale' to my post-install checklist next time ;)
I think this specific case can be closed now? But there was more in-depth discussion on the mailing list about allowing Unicode in the Idris sources in the future. Something about having Idris simply assume the sources are UTF-8?
I'm happy for it to close.
tjice just reported something that looks rather similar in IRC: http://codepad.org/hsRtppRm
Doing some digging - I suspect hGetContents is being called from the |
Perhaps we could set the encoding to UTF-8 on each file handle we open, to force all the .idr files to be read as UTF-8 rather than using the system locale. From the docs: "The default encoding when a Handle is created is localeEncoding, namely the default encoding for the current locale." - https://hackage.haskell.org/package/base-4.7.0.0/docs/System-IO.html#g:23
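A minimal sketch of the per-handle idea, using only `System.IO` from base. The helper names `readSourceUtf8` and `writeSourceUtf8` are hypothetical and do not come from the Idris sources; the point is only that `hSetEncoding h utf8` overrides the `localeEncoding` default on a single handle.

```haskell
import System.IO

-- Hypothetical helper (not from the Idris sources): read a source file
-- as UTF-8 regardless of what the system locale says.
readSourceUtf8 :: FilePath -> IO String
readSourceUtf8 path = do
  h <- openFile path ReadMode
  hSetEncoding h utf8            -- override the localeEncoding default
  contents <- hGetContents h
  -- hGetContents is lazy, so force the whole string before closing.
  length contents `seq` hClose h
  return contents

-- Hypothetical companion writer, so a round trip is locale-independent too.
writeSourceUtf8 :: FilePath -> String -> IO ()
writeSourceUtf8 path str =
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf8
    hPutStr h str
```

With helpers like these, a file containing a copyright sign should load the same way under `LANG=C` as under a UTF-8 locale, at the cost of touching every place a handle is opened.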
This sounds horribly complicated. In my opinion, the right thing to do is to just define UTF-8 as the one true encoding for Idris files, and arrange for the Haskell code to always use it.
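One way to "always use it" without touching every call site might be to change the process-wide default once at start-up. This is a sketch under that assumption, not what Idris actually does; `forceUtf8Default` is a hypothetical name, while `setLocaleEncoding` is exported by `GHC.IO.Encoding` in base.

```haskell
import GHC.IO.Encoding (getLocaleEncoding, setLocaleEncoding)
import System.IO (utf8)

-- Hypothetical start-up hook: call once at the top of main so that
-- handles opened afterwards default to UTF-8 instead of the locale.
forceUtf8Default :: IO ()
forceUtf8Default = setLocaleEncoding utf8
```

After this call, plain `readFile` and `writeFile` from Prelude should decode and encode as UTF-8. Handles that already exist are not changed by it and would still need an explicit `hSetEncoding`. The trade-off is that it also affects output files that have nothing to do with Idris sources.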
Thinking of adding a |
Sounds reasonable if such a thing isn't already in the libraries. /David (from phone)
Oh - we have utf8-string as a dep in .cabal, which already has readFile.
Replaced all usages of readFile, writeFile, hGetLine, and hPutStrLn from Prelude with versions from [System.IO.UTF8](http://hackage.haskell.org/package/utf8-string-0.3.8/docs/System-IO-UTF8.html)

Fixes idris-lang#94

Tested with `export LANG=C`. Was able to load a Unicode-containing .idr file, compile and run it, use interactive vim stuff like case-splitting and proof search, and could also :addproof from the REPL.

Technically, only the change to readFile in Idris/Chaser.hs is necessary to fix idris-lang#94. The rest are mainly for consistency. I am a little leery of changing the output encoding to Unicode when the system encoding might be something else, for the things that don't deal with Idris code, e.g. the .c, .java, .pom, etc. output files that are then run through gcc, javac, whatever.

Also, this might address the issue fixed by idris-lang#1334, and make idris-lang#1334 redundant?
When compiling I got as far as type-checking lib/Prelude/Complex.idr then got the error "hGetContents: invalid argument (invalid byte sequence)".
This happened with "cabal install idris" (version 0.9.5.1) and with a clone of commit cea7205
I changed the copyright header in that file from using a non-ASCII character to "(c)" and this made the error go away, allowing me to compile successfully. I don't know enough about Unicode handling in Haskell/Idris to stop this recurring, but I thought I'd raise the issue and my quick hack.
I'm running Debian unstable on an OLPC XO-1 laptop. Here are some possibly relevant numbers:
$ uname -a
Linux olpc 2.6.32-5-486 #1 Fri Dec 10 15:32:53 UTC 2010 i586 GNU/Linux
$ dpkg -l ghc | grep "ii"
ii ghc 7.4.1-4 i386 The Glasgow Haskell Compilation system
ii libghc-ansi-terminal-de 0.5.5-3+b1 i386 Simple ANSI terminal support, with Windows compatibi
ii libghc-ansi-wl-pprint-d 0.6.4-1+b1 i386 Wadler/Leijen Pretty Printer for colored ANSI termin
ii libghc-dlist-dev 0.5-3+b1 i386 Haskell library for Differences lists
ii libghc-hostname-dev 1.0-4+b1 i386 providing a cross-platform means of determining the
ii libghc-mtl-dev 2.1.1-1 i386 Haskell monad transformer library for GHC
ii libghc-quickcheck2-dev 2.4.2-1+b1 i386 Haskell automatic testing library for GHC
ii libghc-random-dev 1.0.1.1-1+b1 i386 Random number generator for Haskell
ii libghc-regex-base-dev 0.93.2-2+b2 i386 GHC library providing an API for regular expressions
ii libghc-regex-posix-dev 0.95.1-2+b1 i386 GHC library of the POSIX regex backend for regex-bas
ii libghc-smallcheck-dev 0.6-1+b1 i386 Another lightweight testing library
ii libghc-syb-dev 0.3.6.1-1 i386 Generic programming library for Haskell
ii libghc-test-framework-d 0.6-1+b1 i386 Framework for running and organising tests
ii libghc-test-framework-q 0.2.12.1-1+b1 i386 QuickCheck2 support for the test-framework package.
ii libghc-text-dev 0.11.2.0-1 i386 efficient packed Unicode text type for Haskell - GHC
ii libghc-transformers-dev 0.3.0.0-1 i386 Haskell monad transformer library
ii libghc-utf8-string-dev 0.3.7-1+b1 i386 GHC libraries for the Haskell UTF-8 library
ii libghc-x11-dev 1.5.0.1-1+b2 i386 Haskell X11 binding for GHC
ii libghc-xml-dev 1.3.12-1+b2 i386 A simple Haskell XML library - GHC libraries
ii libghc-xmonad-dev 0.10-4+b2 i386 Lightweight X11 window manager; libraries
$ ghc -v
Glasgow Haskell Compiler, Version 7.4.1, stage 2 booted by GHC version 7.4.1
Using binary package database: /usr/lib/ghc/package.conf.d/package.cache
Using binary package database: /home/chris/.ghc/i386-linux-7.4.1/package.conf.d/package.cache
hiding package text-0.11.2.0 to avoid conflict with later version text-0.11.2.3
hiding package mtl-2.1.1 to avoid conflict with later version mtl-2.1.2
wired-in package ghc-prim mapped to ghc-prim-0.2.0.0-bd29cb1ca1b712d64e00ac9207f87d0a
wired-in package integer-gmp mapped to integer-gmp-0.4.0.0-ec87c5d9609a1d46da031ef5d51c4f79
wired-in package base mapped to base-4.5.0.0-c8e7184681d410015e93df85fc49e9dd
wired-in package rts mapped to builtin_rts
wired-in package template-haskell mapped to template-haskell-2.7.0.0-fea440f2bc02cf9a412f25b6b74c4a70
wired-in package dph-seq not found.
wired-in package dph-par not found.
Hsc static flags: -static
*** Deleting temp files:
Deleting:
*** Deleting temp dirs:
Deleting:
ghc: no input files
Usage: For basic information, try the `--help' option.
$ file lib/Prelude/Complex.idr
lib/Prelude/Complex.idr: UTF-8 Unicode text
$ hexdump -C lib/Prelude/Complex.idr | head
00000000 7b 2d 0a 20 20 c2 a9 20 32 30 31 32 20 43 6f 70 |{-. .. 2012 Cop|
00000010 79 72 69 67 68 74 20 4d 65 6b 65 6f 72 20 4d 65 |yright Mekeor Me|
00000020 6c 69 72 65 0a 2d 7d 0a 0a 0a 6d 6f 64 75 6c 65 |lire.-}...module|
00000030 20 50 72 65 6c 75 64 65 2e 43 6f 6d 70 6c 65 78 | Prelude.Complex|
00000040 0a 0a 69 6d 70 6f 72 74 20 42 75 69 6c 74 69 6e |..import Builtin|
00000050 73 0a 69 6d 70 6f 72 74 20 50 72 65 6c 75 64 65 |s.import Prelude|
00000060 0a 0a 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d |..--------------|
00000070 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d |----------------|
00000080 20 52 65 63 74 61 6e 67 75 6c 61 72 20 66 6f 72 | Rectangular for|
00000090 6d 20 0a 0a 69 6e 66 69 78 20 36 20 3a 2b 0a 64 |m ..infix 6 :+.d|
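The `c2 a9` pair at offset 5 of the dump is the UTF-8 encoding of the copyright sign (U+00A9), so the file itself is valid UTF-8; a decoder restricted to ASCII would presumably reject those bytes, which would match the "invalid byte sequence" error. A small sketch to confirm the byte values, using a hypothetical helper `utf8Bytes` and a scratch file path chosen by the caller:

```haskell
import Data.Char (ord)
import System.IO

-- Hypothetical helper: the raw bytes a string occupies when written as UTF-8.
-- It writes to the given scratch file, then reads it back in binary mode.
utf8Bytes :: FilePath -> String -> IO [Int]
utf8Bytes tmp str = do
  withFile tmp WriteMode $ \h -> hSetEncoding h utf8 >> hPutStr h str
  h <- openBinaryFile tmp ReadMode   -- binary mode: one Char per byte
  raw <- hGetContents h
  length raw `seq` hClose h          -- force the lazy read before closing
  return (map ord raw)
```

Calling `utf8Bytes "scratch.bin" "\169"` should give `[194,169]`, that is `0xc2 0xa9`, the same two bytes shown in the hexdump.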