Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

py_func Crashes #47

Open
altostratous opened this issue Feb 7, 2021 · 1 comment
Open

py_func Crashes #47

altostratous opened this issue Feb 7, 2021 · 1 comment
Assignees

Comments

@altostratous
Copy link

Environment info

Operating System:
NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

Installed version of CUDA and cuDNN: None

(please attach the output of ls -l /path/to/cuda/lib/libcud*):

(base) ali@simon:/tmp/mozilla_ali0$ ls -l /path/to/cuda/lib/libcud*
ls: cannot access '/path/to/cuda/lib/libcud*': No such file or directory

If installed from binary pip package, provide:

  1. Which pip package you installed.
  2. The output from python -c "import tensorflow; print(tensorflow.version)".

If installed from sources, provide the commit hash: 11b3284

Steps to reproduce

  1. Instantiate ResNet50 with keras.
  2. Load TensorFI on it.
  3. Run prediction with fault injections enabled.

The code is bellow:

from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np
import TensorFI as fi
from tensorflow.keras.backend import get_session

model = ResNet50(weights='imagenet')

img_path = 'val_5.JPEG'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

session = get_session()

tf = fi.TensorFI(session, disableInjections=False, logLevel=50)

preds = session.run(model.outputs[0], feed_dict={model.inputs[0]: x})

# preds = model.predict(x)

# decode the results into a list of tuples (class, description, probability)
# (one such list for each sample in the batch)
print('Predicted:', decode_predictions(preds, top=3)[0])
# Predicted: [(u'n02504013', u'Indian_elephant', 0.82658225), (u'n01871265', u'tusker', 0.1122357), (u'n02504458', u'African_elephant', 0.061040461)]

here is the input image used:
val_5

when I turn off the injections I get the expected output:

 ('Predicted:', [(u'n04399382', u'teddy', 0.81401235), (u'n02105641', u'Old_English_sheepdog', 0.032959767), (u'n04008634', u'projectile', 0.020169798)])

What have you tried?

  1. tracing the code which ends in some c execution and terminates by a check in py_func.cc

Logs or other output that would be helpful

(If logs are large, please upload as attachment).

/home/ali/anaconda/envs/tensorfi/bin/python /home/ali/Desktop/Code/TensorFI/resnet50/model.py
WARNING:tensorflow:From /home/ali/Desktop/Code/TensorFI/resnet50/model.py:6: The name tf.keras.backend.get_session is deprecated. Please use tf.compat.v1.keras.backend.get_session instead.

WARNING:tensorflow:From /home/ali/anaconda/envs/tensorfi/lib/python2.7/site-packages/tensorflow/python/ops/init_ops.py:1251: calling __init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
2021-02-05 18:53:34.853793: I tensorflow/core/platform/cpu_feature_guard.cc:145] This TensorFlow binary is optimized with Intel(R) MKL-DNN to use the following CPU instructions in performance critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in non-MKL-DNN operations, rebuild TensorFlow with the appropriate compiler flags.
2021-02-05 18:53:34.881342: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2394305000 Hz
2021-02-05 18:53:34.881859: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5582bd2a2eb0 executing computations on platform Host. Devices:
2021-02-05 18:53:34.881909: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
OMP: Info #212: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #210: KMP_AFFINITY: Affinity capable, using global cpuid leaf 11 info
OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: 0-3
OMP: Info #156: KMP_AFFINITY: 4 available OS procs
OMP: Info #157: KMP_AFFINITY: Uniform topology
OMP: Info #179: KMP_AFFINITY: 1 packages x 2 cores/pkg x 2 threads/core (2 total cores)
OMP: Info #214: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #171: KMP_AFFINITY: OS proc 0 maps to package 0 core 0 thread 0 
OMP: Info #171: KMP_AFFINITY: OS proc 1 maps to package 0 core 0 thread 1 
OMP: Info #171: KMP_AFFINITY: OS proc 2 maps to package 0 core 1 thread 0 
OMP: Info #171: KMP_AFFINITY: OS proc 3 maps to package 0 core 1 thread 1 
OMP: Info #250: KMP_AFFINITY: pid 90837 tid 90837 thread 0 bound to OS proc set 0
2021-02-05 18:53:34.882399: I tensorflow/core/common_runtime/process_util.cc:115] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2021-02-05 18:53:35.374807: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set.  If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU.  To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
/home/ali/Desktop/Code/TensorFI/TensorFI/fiConfig.py:270: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  params = yaml.load(pStream)
Unable to open log file faultLogs/NoName-log
Starting log at 2021-02-05 18:53:40.952907


---------------------------------------
2021-02-05 18:53:43.067374: F tensorflow/python/lib/core/py_func.cc:466] Check failed: DataTypeCanUseMemcpy(t.dtype()) 

Process finished with exit code 134 (interrupted by signal 6: SIGABRT)
@karthikp-ubc
Copy link
Contributor

I'm able to reproduce this failure on my side. I suspect it may have to do with py_func being deprecated in TF now, but I'm not sure. It'd be helpful perhaps to determine what operator is causing this by enabling the print statements in the modifyGraph.py file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants