Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Neon support (pretty much working) #35

Merged
merged 34 commits into from
Aug 16, 2019
Merged
Show file tree
Hide file tree
Changes from 29 commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
fa26882
feat: neon support (mostly broken)
sunnygleason Aug 8, 2019
9e96fab
feat: progress on neon compilation
sunnygleason Aug 9, 2019
05fd4f3
feat: partial implementation of deserializer (broken, needs movemask)
sunnygleason Aug 9, 2019
086e745
feat: re-enable tests (still broken)
sunnygleason Aug 9, 2019
b9b3900
feat: remove core_arch unused dependency
sunnygleason Aug 9, 2019
1c989cf
feat: update string parse
sunnygleason Aug 10, 2019
5926070
feat: additional tweaks, not breaks during linking
sunnygleason Aug 10, 2019
62a3f93
feat: fixing intrinsics (maybe)
sunnygleason Aug 10, 2019
bca6cba
feat: temp stub replacements for intrinsics (still broken)
sunnygleason Aug 10, 2019
620e697
feat: get code closer to simdjson (still broken)
sunnygleason Aug 13, 2019
68691ad
feat: trying to fix parse_str
sunnygleason Aug 13, 2019
be60690
fix: numberparse
sunnygleason Aug 13, 2019
e6609ff
fix: endian in flatten_bits
sunnygleason Aug 13, 2019
45cbbd4
fix: utf8 encoding (still broken but closer)
sunnygleason Aug 13, 2019
15c892b
fix: use write instead of intrinsic (for now)
sunnygleason Aug 13, 2019
e3872b5
fix: fix string parsing (mostly, thanks to @licenser)
sunnygleason Aug 13, 2019
782b0a6
fix: add_overflow should always output carry
sunnygleason Aug 14, 2019
98b3356
fix: tests PASS, improved comparison operators, thanks @Licenser :)
sunnygleason Aug 15, 2019
19ad13e
fix: update arm64 to use nightly
sunnygleason Aug 15, 2019
a97487b
fix: use rust nightly image for drone CI
sunnygleason Aug 15, 2019
faea015
fix: drone CI rustup nightly
sunnygleason Aug 15, 2019
d66d68a
feat: fix guards, use rust stdlib for bit count operations
sunnygleason Aug 15, 2019
4c51dc4
feat: address review comments, platform handling and misc
sunnygleason Aug 15, 2019
bdc82d6
feat: refactor immediate tables into loads
sunnygleason Aug 15, 2019
4fb672a
feat: improving code style and similarity across architectures
sunnygleason Aug 16, 2019
ee2bd9b
feat: update generator with neon support, conditional compilation
sunnygleason Aug 16, 2019
75511d3
fix: remove double semicolon
sunnygleason Aug 16, 2019
23191c6
feat: conditional feature enablement for neon only
sunnygleason Aug 16, 2019
d514f2d
fix: add conditional compilation for target_feature=-avx2,-sse4.2 on …
sunnygleason Aug 16, 2019
3bdc169
feat: factor arch-specific methods into separate modules (with macros…
sunnygleason Aug 16, 2019
40bb843
fix: does drone need clean?
sunnygleason Aug 16, 2019
78b0aa8
fix: utf8 error checking (maybe)
sunnygleason Aug 16, 2019
5a23b40
fix: utf8 error checking (maybe maybe)
sunnygleason Aug 16, 2019
af177f3
feat: fancy generic generator functions, thanks @Licenser
sunnygleason Aug 16, 2019
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions .drone.yml
Original file line number Diff line number Diff line change
Expand Up @@ -58,5 +58,7 @@ steps:
- name: test
image: rust:1
commands:
- cargo build --verbose --all
- cargo test --verbose --all
- rustup default nightly
- rustup update
- cargo +nightly build --verbose --all
- cargo +nightly test --verbose --all
34 changes: 30 additions & 4 deletions src/lib.rs
Original file line number Diff line number Diff line change
@@ -1,4 +1,18 @@
#![deny(warnings)]

#![cfg_attr(target_feature = "neon", feature(
asm,
stdsimd,
repr_simd,
custom_inner_attributes,
aarch64_target_feature,
platform_intrinsics,
stmt_expr_attributes,
simd_ffi,
link_llvm_intrinsics
)
)]

#![cfg_attr(feature = "hints", feature(core_intrinsics))]
//! simdjson-rs is a rust port of the simejson c++ library. It follows
//! most of the design closely with a few exceptions to make it better
Expand Down Expand Up @@ -89,17 +103,25 @@ pub use crate::avx2::deser::*;
#[cfg(target_feature = "avx2")]
use crate::avx2::stage1::SIMDJSON_PADDING;

#[cfg(not(target_feature = "avx2"))]
#[cfg(all(any(target_arch = "x86", target_arch = "x86_64"), not(target_feature = "avx2")))]
mod sse42;
#[cfg(not(target_feature = "avx2"))]
#[cfg(all(any(target_arch = "x86", target_arch = "x86_64"), not(target_feature = "avx2")))]
pub use crate::sse42::deser::*;
#[cfg(not(target_feature = "avx2"))]
#[cfg(all(any(target_arch = "x86", target_arch = "x86_64"), not(target_feature = "avx2")))]
use crate::sse42::stage1::SIMDJSON_PADDING;

#[cfg(target_feature = "neon")]
mod neon;
#[cfg(target_feature = "neon")]
pub use crate::neon::deser::*;
#[cfg(target_feature = "neon")]
use crate::neon::stage1::SIMDJSON_PADDING;

mod stage2;
pub mod value;

use crate::numberparse::Number;
#[cfg(not(target_feature = "neon"))]
use std::mem;
use std::str;

Expand Down Expand Up @@ -163,7 +185,11 @@ impl<'de> Deserializer<'de> {

let counts = Deserializer::validate(input, &structural_indexes)?;

let strings = Vec::with_capacity(len + SIMDJSON_PADDING);
// Set length to allow slice access in ARM code
let mut strings = Vec::with_capacity(len + SIMDJSON_PADDING);
unsafe {
sunnygleason marked this conversation as resolved.
Show resolved Hide resolved
strings.set_len(len + SIMDJSON_PADDING);
}

Ok(Deserializer {
counts,
Expand Down
199 changes: 199 additions & 0 deletions src/neon/deser.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,199 @@

pub use crate::error::{Error, ErrorType};
pub use crate::Deserializer;
pub use crate::Result;
pub use crate::neon::stage1::*;
pub use crate::neon::utf8check::*;
pub use crate::neon::intrinsics::*;
pub use crate::stringparse::*;

impl<'de> Deserializer<'de> {
#[cfg_attr(not(feature = "no-inline"), inline(always))]
pub fn parse_str_(&mut self) -> Result<&'de str> {
// Add 1 to skip the initial "
let idx = self.iidx + 1;
let mut padding = [0u8; 32];
//let mut read: usize = 0;

// we include the terminal '"' so we know where to end
// This is safe since we check sub's lenght in the range access above and only
// create sub sliced form sub to `sub.len()`.

let src: &[u8] = unsafe { &self.input.get_unchecked(idx..) };
let mut src_i: usize = 0;
let mut len = src_i;
loop {
// store to dest unconditionally - we can overwrite the bits we don't like
// later

let (v0, v1) = if src.len() >= src_i + 32 {
// This is safe since we ensure src is at least 16 wide
#[allow(clippy::cast_ptr_alignment)]
unsafe {
(
vld1q_u8(src.get_unchecked(src_i..src_i + 16).as_ptr()),
vld1q_u8(src.get_unchecked(src_i + 16..src_i + 32).as_ptr()),
)
}
} else {
unsafe {
padding
.get_unchecked_mut(..src.len() - src_i)
.clone_from_slice(src.get_unchecked(src_i..));
// This is safe since we ensure src is at least 32 wide
(
vld1q_u8(padding.get_unchecked(0..16).as_ptr()),
vld1q_u8(padding.get_unchecked(16..32).as_ptr()),
)
}
};

let ParseStringHelper { bs_bits, quote_bits } = find_bs_bits_and_quote_bits(v0, v1);

if (bs_bits.wrapping_sub(1) & quote_bits) != 0 {
// we encountered quotes first. Move dst to point to quotes and exit
// find out where the quote is...
let quote_dist: u32 = quote_bits.trailing_zeros();

///////////////////////
// Above, check for overflow in case someone has a crazy string (>=4GB?)
// But only add the overflow check when the document itself exceeds 4GB
// Currently unneeded because we refuse to parse docs larger or equal to 4GB.
////////////////////////

// we advance the point, accounting for the fact that we have a NULl termination

len += quote_dist as usize;
unsafe {
let v = self.input.get_unchecked(idx..idx + len) as *const [u8] as *const str;
return Ok(&*v);
}

// we compare the pointers since we care if they are 'at the same spot'
// not if they are the same value
}
if (quote_bits.wrapping_sub(1) & bs_bits) != 0 {
// Move to the 'bad' character
let bs_dist: u32 = bs_bits.trailing_zeros();
len += bs_dist as usize;
src_i += bs_dist as usize;
break;
} else {
// they are the same. Since they can't co-occur, it means we encountered
// neither.
src_i += 32;
len += 32;
}
}

let mut dst_i: usize = 0;
let dst: &mut [u8] = self.strings.as_mut_slice();

loop {
let (v0, v1) = if src.len() >= src_i + 32 {
// This is safe since we ensure src is at least 16 wide
#[allow(clippy::cast_ptr_alignment)]
unsafe {
(
vld1q_u8(src.get_unchecked(src_i..src_i + 16).as_ptr()),
vld1q_u8(src.get_unchecked(src_i + 16..src_i + 32).as_ptr()),
)
}
} else {
unsafe {
padding
.get_unchecked_mut(..src.len() - src_i)
.clone_from_slice(src.get_unchecked(src_i..));
// This is safe since we ensure src is at least 32 wide
(
vld1q_u8(padding.get_unchecked(0..16).as_ptr()),
vld1q_u8(padding.get_unchecked(16..32).as_ptr()),
)
}
};

unsafe {
dst.get_unchecked_mut(dst_i..dst_i + 32).copy_from_slice(src.get_unchecked(src_i..src_i + 32));
}

// store to dest unconditionally - we can overwrite the bits we don't like
// later
let ParseStringHelper { bs_bits, quote_bits } = find_bs_bits_and_quote_bits(v0, v1);

if (bs_bits.wrapping_sub(1) & quote_bits) != 0 {
sunnygleason marked this conversation as resolved.
Show resolved Hide resolved
// we encountered quotes first. Move dst to point to quotes and exit
// find out where the quote is...
let quote_dist: u32 = quote_bits.trailing_zeros();

///////////////////////
// Above, check for overflow in case someone has a crazy string (>=4GB?)
// But only add the overflow check when the document itself exceeds 4GB
// Currently unneeded because we refuse to parse docs larger or equal to 4GB.
////////////////////////

// we advance the point, accounting for the fact that we have a NULl termination

dst_i += quote_dist as usize;
unsafe {
self.input
.get_unchecked_mut(idx + len..idx + len + dst_i)
.clone_from_slice(&self.strings.get_unchecked(..dst_i));
let v = self.input.get_unchecked(idx..idx + len + dst_i) as *const [u8]
as *const str;
self.str_offset += dst_i as usize;
return Ok(&*v);
}

// we compare the pointers since we care if they are 'at the same spot'
// not if they are the same value
}
if (quote_bits.wrapping_sub(1) & bs_bits) != 0 {
// find out where the backspace is
let bs_dist: u32 = bs_bits.trailing_zeros();
let escape_char: u8 = unsafe { *src.get_unchecked(src_i + bs_dist as usize + 1) };
// we encountered backslash first. Handle backslash
if escape_char == b'u' {
// move src/dst up to the start; they will be further adjusted
// within the unicode codepoint handling code.
src_i += bs_dist as usize;
dst_i += bs_dist as usize;
let (o, s) = if let Ok(r) = handle_unicode_codepoint(
unsafe { src.get_unchecked(src_i..) },
unsafe { dst.get_unchecked_mut(dst_i..) }
)
{
r
} else {
return Err(self.error(ErrorType::InvlaidUnicodeCodepoint));
};
if o == 0 {
return Err(self.error(ErrorType::InvlaidUnicodeCodepoint));
};
// We moved o steps forword at the destiation and 6 on the source
src_i += s;
dst_i += o;
} else {
// simple 1:1 conversion. Will eat bs_dist+2 characters in input and
// write bs_dist+1 characters to output
// note this may reach beyond the part of the buffer we've actually
// seen. I think this is ok
let escape_result: u8 =
unsafe { *ESCAPE_MAP.get_unchecked(escape_char as usize) };
if escape_result == 0 {
return Err(self.error(ErrorType::InvalidEscape));
}
unsafe {
*dst.get_unchecked_mut(dst_i + bs_dist as usize) = escape_result;
}
src_i += bs_dist as usize + 2;
dst_i += bs_dist as usize + 1;
}
} else {
// they are the same. Since they can't co-occur, it means we encountered
// neither.
src_i += 32;
dst_i += 32;
}
}
}
}
Loading