Missing required field PrepID #351

Open
llayer opened this issue Feb 8, 2019 · 18 comments

llayer commented Feb 8, 2019

Hi,

I have a problem running myspark from lxplus.
I work with Jean-Roch and I need to access some error logs.
I tried to reproduce the example in https://github.com/dmwm/WMArchive/wiki/How-to-find-records-on-HDFS-using-pyspark

I logged in from an lxplus node following https://hadoop-user-guide.web.cern.ch/hadoop-user-guide/getstart/client_cvmfs.html, since it is no longer possible to log in via ssh analytix.

However, when I run the example from the twiki:
{"spec":{"task":"/amaltaro_StepChain_ReDigi3_HG1612_WMArchive_161130_192654_9283/DIGI","timerange":[20161130,20161202]}, "fields":[]}

myspark --spec=cond.spec --script=RecordFinder --records-output=records.json
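
(For completeness, the spec shown above is what I put into the file cond.spec that the command refers to, e.g.

cat > cond.spec << 'EOF'
{"spec":{"task":"/amaltaro_StepChain_ReDigi3_HG1612_WMArchive_161130_192654_9283/DIGI","timerange":[20161130,20161202]}, "fields":[]}
EOF

and then I run myspark exactly as above.)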

I get an error message telling me that the PrepID is missing.
I attach the output of myspark below.

It would be great if you could help me solve this problem.

Many, many thanks in advance,
Best, Lukas

[llayer@lxplus130 WMArchive]$ myspark --spec=cond.spec --script=RecordFinder --records-output=records.json
Using spark 2.X
Enable CVMFS ...
Ivy Default Cache set to: /afs/cern.ch/user/l/llayer/.ivy2/cache
The jars for the packages stored in: /afs/cern.ch/user/l/llayer/.ivy2/jars
:: loading settings :: url = jar:file:/cvmfs/sft.cern.ch/lcg/releases/spark/2.2.1-3206b/x86_64-centos7-gcc62-opt/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
com.databricks#spark-avro_2.11 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
        confs: [default]
        found com.databricks#spark-avro_2.11;4.0.0 in central
        found org.slf4j#slf4j-api;1.7.5 in central
        found org.apache.avro#avro;1.7.6 in central
        found org.codehaus.jackson#jackson-core-asl;1.9.13 in central
        found org.codehaus.jackson#jackson-mapper-asl;1.9.13 in central
        found com.thoughtworks.paranamer#paranamer;2.3 in central
        found org.xerial.snappy#snappy-java;1.0.5 in central
        found org.apache.commons#commons-compress;1.4.1 in central
        found org.tukaani#xz;1.0 in central
:: resolution report :: resolve 963ms :: artifacts dl 38ms
        :: modules in use:
        com.databricks#spark-avro_2.11;4.0.0 from central in [default]
        com.thoughtworks.paranamer#paranamer;2.3 from central in [default]
        org.apache.avro#avro;1.7.6 from central in [default]
        org.apache.commons#commons-compress;1.4.1 from central in [default]
        org.codehaus.jackson#jackson-core-asl;1.9.13 from central in [default]
        org.codehaus.jackson#jackson-mapper-asl;1.9.13 from central in [default]
        org.slf4j#slf4j-api;1.7.5 from central in [default]
        org.tukaani#xz;1.0 from central in [default]
        org.xerial.snappy#snappy-java;1.0.5 from central in [default]
        :: evicted modules:
        org.slf4j#slf4j-api;1.6.4 by [org.slf4j#slf4j-api;1.7.5] in [default]
        ---------------------------------------------------------------------
        |                  |            modules            ||   artifacts   |
        |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
        ---------------------------------------------------------------------
        |      default     |   10  |   0   |   0   |   1   ||   9   |   0   |
        ---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
        confs: [default]
        0 artifacts copied, 9 already retrieved (0kB/25ms)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/cvmfs/sft.cern.ch/lcg/releases/spark/2.2.1-3206b/x86_64-centos7-gcc62-opt/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/cvmfs/sft.cern.ch/lcg/releases/hadoop/2.7.3-b1221/x86_64-centos7-gcc62-opt/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
19/02/08 09:11:08 INFO SparkContext: Running Spark version 2.2.1
19/02/08 09:11:08 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/02/08 09:11:09 INFO SparkContext: Submitted application: AvroKeyInputFormat
19/02/08 09:11:09 INFO SecurityManager: Changing view acls to: llayer
19/02/08 09:11:09 INFO SecurityManager: Changing modify acls to: llayer
19/02/08 09:11:09 INFO SecurityManager: Changing view acls groups to: 
19/02/08 09:11:09 INFO SecurityManager: Changing modify acls groups to: 
19/02/08 09:11:09 INFO SecurityManager: SecurityManager: authentication enabled; ui acls disabled; users  with view permissions: Set(llayer); groups with view permissions: Set(); users  with modify permissions: Set(llayer); groups with modify permissions: Set()
19/02/08 09:11:09 INFO Utils: Successfully started service 'sparkDriver' on port 5001.
19/02/08 09:11:09 INFO SparkEnv: Registering MapOutputTracker
19/02/08 09:11:09 INFO SparkEnv: Registering BlockManagerMaster
19/02/08 09:11:09 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
19/02/08 09:11:09 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
19/02/08 09:11:09 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-71770f69-9df2-48f0-96b8-541540e19d03
19/02/08 09:11:09 INFO MemoryStore: MemoryStore started with capacity 2004.6 MB
19/02/08 09:11:09 INFO SparkEnv: Registering OutputCommitCoordinator
19/02/08 09:11:09 INFO log: Logging initialized @11312ms
19/02/08 09:11:09 INFO Server: jetty-9.3.z-SNAPSHOT
19/02/08 09:11:09 INFO Server: Started @11410ms
19/02/08 09:11:09 INFO AbstractConnector: Started ServerConnector@220b5902{HTTP/1.1,[http/1.1]}{0.0.0.0:5201}
19/02/08 09:11:09 INFO Utils: Successfully started service 'SparkUI' on port 5201.
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1ea5f333{/jobs,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@4c756e2b{/jobs/json,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@9cf0917{/jobs/job,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@5c0adfa3{/jobs/job/json,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@154de6bd{/stages,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@59dff77e{/stages/json,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@69ffa2df{/stages/stage,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@63fc12aa{/stages/stage/json,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@56bcafc0{/stages/pool,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@2fd3a3e0{/stages/pool/json,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@6a0d5620{/storage,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@38c6051f{/storage/json,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@3db721{/storage/rdd,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@41726ede{/storage/rdd/json,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@26fada3e{/environment,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1ef26daf{/environment/json,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@3aa8db11{/executors,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@5c06ad81{/executors/json,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@74b7f684{/executors/threadDump,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@6db8becd{/executors/threadDump/json,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@703bb05b{/static,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@346b2582{/,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@48511a93{/api,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@4f021b90{/jobs/job/kill,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@27a69ce6{/stages/stage/kill,null,AVAILABLE,@Spark}
19/02/08 09:11:09 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://188.184.30.74:5201
19/02/08 09:11:09 INFO SparkContext: Added JAR file:/afs/cern.ch/user/v/valya/public/spark/spark-examples-1.6.0-cdh5.15.1-hadoop2.6.0-cdh5.15.1.jar at spark://188.184.30.74:5001/jars/spark-examples-1.6.0-cdh5.15.1-hadoop2.6.0-cdh5.15.1.jar with timestamp 1549613469967
19/02/08 09:11:09 INFO SparkContext: Added JAR file:/afs/cern.ch/user/l/llayer/.ivy2/jars/com.databricks_spark-avro_2.11-4.0.0.jar at spark://188.184.30.74:5001/jars/com.databricks_spark-avro_2.11-4.0.0.jar with timestamp 1549613469968
19/02/08 09:11:09 INFO SparkContext: Added JAR file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.slf4j_slf4j-api-1.7.5.jar at spark://188.184.30.74:5001/jars/org.slf4j_slf4j-api-1.7.5.jar with timestamp 1549613469968
19/02/08 09:11:09 INFO SparkContext: Added JAR file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.apache.avro_avro-1.7.6.jar at spark://188.184.30.74:5001/jars/org.apache.avro_avro-1.7.6.jar with timestamp 1549613469969
19/02/08 09:11:09 INFO SparkContext: Added JAR file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.codehaus.jackson_jackson-core-asl-1.9.13.jar at spark://188.184.30.74:5001/jars/org.codehaus.jackson_jackson-core-asl-1.9.13.jar with timestamp 1549613469969
19/02/08 09:11:09 INFO SparkContext: Added JAR file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar at spark://188.184.30.74:5001/jars/org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar with timestamp 1549613469969
19/02/08 09:11:09 INFO SparkContext: Added JAR file:/afs/cern.ch/user/l/llayer/.ivy2/jars/com.thoughtworks.paranamer_paranamer-2.3.jar at spark://188.184.30.74:5001/jars/com.thoughtworks.paranamer_paranamer-2.3.jar with timestamp 1549613469969
19/02/08 09:11:09 INFO SparkContext: Added JAR file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.xerial.snappy_snappy-java-1.0.5.jar at spark://188.184.30.74:5001/jars/org.xerial.snappy_snappy-java-1.0.5.jar with timestamp 1549613469969
19/02/08 09:11:09 INFO SparkContext: Added JAR file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.apache.commons_commons-compress-1.4.1.jar at spark://188.184.30.74:5001/jars/org.apache.commons_commons-compress-1.4.1.jar with timestamp 1549613469970
19/02/08 09:11:09 INFO SparkContext: Added JAR file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.tukaani_xz-1.0.jar at spark://188.184.30.74:5001/jars/org.tukaani_xz-1.0.jar with timestamp 1549613469970
19/02/08 09:11:10 INFO SparkContext: Added file file:/afs/cern.ch/user/l/llayer/test_wm/WMArchive/src/python/WMArchive/Tools/myspark.py at file:/afs/cern.ch/user/l/llayer/test_wm/WMArchive/src/python/WMArchive/Tools/myspark.py with timestamp 1549613470276
19/02/08 09:11:10 INFO Utils: Copying /afs/cern.ch/user/l/llayer/test_wm/WMArchive/src/python/WMArchive/Tools/myspark.py to /tmp/spark-4dc0c089-9e65-4cf0-a8d4-f54426f5dac2/userFiles-3ab4c76a-4051-491f-8b02-b84f7a31e719/myspark.py
19/02/08 09:11:10 INFO SparkContext: Added file file:/afs/cern.ch/user/l/llayer/.ivy2/jars/com.databricks_spark-avro_2.11-4.0.0.jar at file:/afs/cern.ch/user/l/llayer/.ivy2/jars/com.databricks_spark-avro_2.11-4.0.0.jar with timestamp 1549613470304
19/02/08 09:11:10 INFO Utils: Copying /afs/cern.ch/user/l/llayer/.ivy2/jars/com.databricks_spark-avro_2.11-4.0.0.jar to /tmp/spark-4dc0c089-9e65-4cf0-a8d4-f54426f5dac2/userFiles-3ab4c76a-4051-491f-8b02-b84f7a31e719/com.databricks_spark-avro_2.11-4.0.0.jar
19/02/08 09:11:10 INFO SparkContext: Added file file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.slf4j_slf4j-api-1.7.5.jar at file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.slf4j_slf4j-api-1.7.5.jar with timestamp 1549613470313
19/02/08 09:11:10 INFO Utils: Copying /afs/cern.ch/user/l/llayer/.ivy2/jars/org.slf4j_slf4j-api-1.7.5.jar to /tmp/spark-4dc0c089-9e65-4cf0-a8d4-f54426f5dac2/userFiles-3ab4c76a-4051-491f-8b02-b84f7a31e719/org.slf4j_slf4j-api-1.7.5.jar
19/02/08 09:11:10 INFO SparkContext: Added file file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.apache.avro_avro-1.7.6.jar at file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.apache.avro_avro-1.7.6.jar with timestamp 1549613470323
19/02/08 09:11:10 INFO Utils: Copying /afs/cern.ch/user/l/llayer/.ivy2/jars/org.apache.avro_avro-1.7.6.jar to /tmp/spark-4dc0c089-9e65-4cf0-a8d4-f54426f5dac2/userFiles-3ab4c76a-4051-491f-8b02-b84f7a31e719/org.apache.avro_avro-1.7.6.jar
19/02/08 09:11:10 INFO SparkContext: Added file file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.codehaus.jackson_jackson-core-asl-1.9.13.jar at file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.codehaus.jackson_jackson-core-asl-1.9.13.jar with timestamp 1549613470336
19/02/08 09:11:10 INFO Utils: Copying /afs/cern.ch/user/l/llayer/.ivy2/jars/org.codehaus.jackson_jackson-core-asl-1.9.13.jar to /tmp/spark-4dc0c089-9e65-4cf0-a8d4-f54426f5dac2/userFiles-3ab4c76a-4051-491f-8b02-b84f7a31e719/org.codehaus.jackson_jackson-core-asl-1.9.13.jar
19/02/08 09:11:10 INFO SparkContext: Added file file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar at file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar with timestamp 1549613470347
19/02/08 09:11:10 INFO Utils: Copying /afs/cern.ch/user/l/llayer/.ivy2/jars/org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar to /tmp/spark-4dc0c089-9e65-4cf0-a8d4-f54426f5dac2/userFiles-3ab4c76a-4051-491f-8b02-b84f7a31e719/org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar
19/02/08 09:11:10 INFO SparkContext: Added file file:/afs/cern.ch/user/l/llayer/.ivy2/jars/com.thoughtworks.paranamer_paranamer-2.3.jar at file:/afs/cern.ch/user/l/llayer/.ivy2/jars/com.thoughtworks.paranamer_paranamer-2.3.jar with timestamp 1549613470366
19/02/08 09:11:10 INFO Utils: Copying /afs/cern.ch/user/l/llayer/.ivy2/jars/com.thoughtworks.paranamer_paranamer-2.3.jar to /tmp/spark-4dc0c089-9e65-4cf0-a8d4-f54426f5dac2/userFiles-3ab4c76a-4051-491f-8b02-b84f7a31e719/com.thoughtworks.paranamer_paranamer-2.3.jar
19/02/08 09:11:10 INFO SparkContext: Added file file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.xerial.snappy_snappy-java-1.0.5.jar at file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.xerial.snappy_snappy-java-1.0.5.jar with timestamp 1549613470376
19/02/08 09:11:10 INFO Utils: Copying /afs/cern.ch/user/l/llayer/.ivy2/jars/org.xerial.snappy_snappy-java-1.0.5.jar to /tmp/spark-4dc0c089-9e65-4cf0-a8d4-f54426f5dac2/userFiles-3ab4c76a-4051-491f-8b02-b84f7a31e719/org.xerial.snappy_snappy-java-1.0.5.jar
19/02/08 09:11:10 INFO SparkContext: Added file file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.apache.commons_commons-compress-1.4.1.jar at file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.apache.commons_commons-compress-1.4.1.jar with timestamp 1549613470426
19/02/08 09:11:10 INFO Utils: Copying /afs/cern.ch/user/l/llayer/.ivy2/jars/org.apache.commons_commons-compress-1.4.1.jar to /tmp/spark-4dc0c089-9e65-4cf0-a8d4-f54426f5dac2/userFiles-3ab4c76a-4051-491f-8b02-b84f7a31e719/org.apache.commons_commons-compress-1.4.1.jar
19/02/08 09:11:10 INFO SparkContext: Added file file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.tukaani_xz-1.0.jar at file:/afs/cern.ch/user/l/llayer/.ivy2/jars/org.tukaani_xz-1.0.jar with timestamp 1549613470436
19/02/08 09:11:10 INFO Utils: Copying /afs/cern.ch/user/l/llayer/.ivy2/jars/org.tukaani_xz-1.0.jar to /tmp/spark-4dc0c089-9e65-4cf0-a8d4-f54426f5dac2/userFiles-3ab4c76a-4051-491f-8b02-b84f7a31e719/org.tukaani_xz-1.0.jar
19/02/08 09:11:10 INFO Executor: Starting executor ID driver on host localhost
19/02/08 09:11:10 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 5101.
19/02/08 09:11:10 INFO NettyBlockTransferService: Server created on 188.184.30.74:5101
19/02/08 09:11:10 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
19/02/08 09:11:10 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 188.184.30.74, 5101, None)
19/02/08 09:11:10 INFO BlockManagerMasterEndpoint: Registering block manager 188.184.30.74:5101 with 2004.6 MB RAM, BlockManagerId(driver, 188.184.30.74, 5101, None)
19/02/08 09:11:10 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 188.184.30.74, 5101, None)
19/02/08 09:11:10 INFO BlockManager: external shuffle service port = 7337
19/02/08 09:11:10 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 188.184.30.74, 5101, None)
19/02/08 09:11:10 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@3a8ae0e6{/metrics/json,null,AVAILABLE,@Spark}
19/02/08 09:11:10 WARN DFSUtil: Namenode for hadalytic remains unresolved for ID p01001532067275.cern.ch.  Check your hdfs-site.xml file to ensure namenodes are configured properly.
19/02/08 09:11:11 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
19/02/08 09:11:11 INFO EventLoggingListener: Logging events to hdfs:///user/spark/applicationHistory/local-1549613470488
19/02/08 09:11:11 INFO SparkContext: Added file /afs/cern.ch/user/l/llayer/test_wm/WMArchive/src/python/WMArchive/PySpark/RecordFinder.py at file:/afs/cern.ch/user/l/llayer/test_wm/WMArchive/src/python/WMArchive/PySpark/RecordFinder.py with timestamp 1549613471955
19/02/08 09:11:11 INFO Utils: Copying /afs/cern.ch/user/l/llayer/test_wm/WMArchive/src/python/WMArchive/PySpark/RecordFinder.py to /tmp/spark-4dc0c089-9e65-4cf0-a8d4-f54426f5dac2/userFiles-3ab4c76a-4051-491f-8b02-b84f7a31e719/RecordFinder.py
19/02/08 09:11:14 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
org.apache.avro.AvroTypeException: Found ns12.name12, expecting ns12.name12, missing required field PrepID
        at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:292)
        at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
        at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:130)
        at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:176)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
        at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
        at org.apache.avro.mapreduce.AvroRecordReaderBase.nextKeyValue(AvroRecordReaderBase.java:118)
        at org.apache.avro.mapreduce.AvroKeyRecordReader.nextKeyValue(AvroKeyRecordReader.java:53)
        at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:207)
        at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:389)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
        at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
        at scala.collection.AbstractIterator.to(Iterator.scala:1336)
        at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
        at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1336)
        at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
        at scala.collection.AbstractIterator.toArray(Iterator.scala:1336)
        at org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$29.apply(RDD.scala:1354)
        at org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$29.apply(RDD.scala:1354)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2069)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2069)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
19/02/08 09:11:14 ERROR TaskSetManager: Task 0 in stage 1.0 failed 1 times; aborting job
Traceback (most recent call last):
  File "/afs/cern.ch/user/l/llayer/test_wm/WMArchive/src/python/WMArchive/Tools/myspark.py", line 455, in <module>
    main()
  File "/afs/cern.ch/user/l/llayer/test_wm/WMArchive/src/python/WMArchive/Tools/myspark.py", line 415, in main
    results = run(opts.schema, hdir, opts.script, opts.spec, verbose, opts.rout, opts.yarn)
  File "/afs/cern.ch/user/l/llayer/test_wm/WMArchive/src/python/WMArchive/Tools/myspark.py", line 275, in run
    avro_rdd = ctx.union([ctx.newAPIHadoopFile(f, aformat, akey, awrite, aconv, conf=conf) for f in data_path])
  File "/cvmfs/sft.cern.ch/lcg/releases/spark/2.2.1-3206b/x86_64-centos7-gcc62-opt/python/lib/pyspark.zip/pyspark/context.py", line 675, in newAPIHadoopFile
  File "/cvmfs/sft.cern.ch/lcg/releases/spark/2.2.1-3206b/x86_64-centos7-gcc62-opt/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
  File "/cvmfs/sft.cern.ch/lcg/releases/spark/2.2.1-3206b/x86_64-centos7-gcc62-opt/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopFile.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 1, localhost, executor driver): org.apache.avro.AvroTypeException: Found ns12.name12, expecting ns12.name12, missing required field PrepID
        at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:292)
        at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
        at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:130)
        at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:176)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
        at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
        at org.apache.avro.mapreduce.AvroRecordReaderBase.nextKeyValue(AvroRecordReaderBase.java:118)
        at org.apache.avro.mapreduce.AvroKeyRecordReader.nextKeyValue(AvroKeyRecordReader.java:53)
        at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:207)
        at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:389)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
        at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
        at scala.collection.AbstractIterator.to(Iterator.scala:1336)
        at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
        at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1336)
        at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
        at scala.collection.AbstractIterator.toArray(Iterator.scala:1336)
        at org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$29.apply(RDD.scala:1354)
        at org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$29.apply(RDD.scala:1354)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2069)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2069)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1517)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1505)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1504)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1504)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
        at scala.Option.foreach(Option.scala:257)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1732)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1687)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1676)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2029)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2050)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2069)
        at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1354)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
        at org.apache.spark.rdd.RDD.take(RDD.scala:1327)
        at org.apache.spark.api.python.SerDeUtil$.pairRDDToPython(SerDeUtil.scala:203)
        at org.apache.spark.api.python.PythonRDD$.newAPIHadoopFile(PythonRDD.scala:570)
        at org.apache.spark.api.python.PythonRDD.newAPIHadoopFile(PythonRDD.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:280)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.avro.AvroTypeException: Found ns12.name12, expecting ns12.name12, missing required field PrepID
        at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:292)
        at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
        at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:130)
        at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:176)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
        at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
        at org.apache.avro.mapreduce.AvroRecordReaderBase.nextKeyValue(AvroRecordReaderBase.java:118)
        at org.apache.avro.mapreduce.AvroKeyRecordReader.nextKeyValue(AvroKeyRecordReader.java:53)
        at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:207)
        at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:389)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
        at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
        at scala.collection.AbstractIterator.to(Iterator.scala:1336)
        at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
        at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1336)
        at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
        at scala.collection.AbstractIterator.toArray(Iterator.scala:1336)
        at org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$29.apply(RDD.scala:1354)
        at org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$29.apply(RDD.scala:1354)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2069)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2069)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        ... 1 more
@vkuznet (Contributor)

vkuznet commented Feb 18, 2019

Lukas, sorry for the delay in replying, I was traveling and sick. I can reproduce this problem, but since the task name and date range you used are quite old, I bet it is an issue with the schema of the data, which has been migrated. Can you try another, more recent, task name and date range?
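
For example, a spec along these lines (the task name below is only a placeholder, pick any workflow from the last few weeks and adjust the timerange accordingly):

{"spec":{"task":"/SOME_RECENT_WORKFLOW_TASK/DIGI","timerange":[20190201,20190208]}, "fields":[]}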

@vlimant

vlimant commented Feb 19, 2019

Hi @vkuznet, do you have a bit of bandwidth to help Lukas move forward?

@vkuznet (Contributor)

vkuznet commented Feb 19, 2019 via email

@vlimant

vlimant commented Feb 19, 2019

Makes sense, if the PrepID is explicitly requested or searched for.
Which is it, @llayer? Do you retrieve the PrepID explicitly, or does the query contain the PrepID?

@vkuznet (Contributor)

vkuznet commented Feb 19, 2019 via email

@llayer (Author)

llayer commented Feb 22, 2019

Hi Valentin, Jean-Roch,

I finally managed to test and run a new task for a more recent production, for example I made a
spec to find an LFN file that I recently used in my analysis:

{"spec":{"lfn":"/store/mc/RunIISummer16NanoAODv4/TT_TuneCUETP8M2T4_13TeV-powheg-pythia8/NANOAODSIM/PUMoriond17_Nano14Dec2018_102X_mcRun2_asymptotic_v6-v1/90000/FCA7190F-5031-DA4F-8961-D0351E0D3FEB.root","timerange":[20190129,20190131]}, "fields":[]}

This runs and I get an output file as described in the wiki.
Next I tried to follow the instructions to find the log files, as described here:
https://github.com/dmwm/WMArchive/wiki/How-to-find-logArchive-logCollect-location-for-given-LFN
and https://github.com/dmwm/WMArchive/wiki/Steps-for-Querying-LogCollect-file-Location

I tried to look up the log file for one unmerged task that was returned, i.e. I made a spec:

{"spec":{"lfn":"/store/unmerged/RunIISummer16NanoAODv4/TT_TuneCUETP8M2T4_13TeV-powheg-pythia8/NANOAODSIM/PUMoriond17_Nano14Dec2018_102X_mcRun2_asymptotic_v6-v1/90000/121C44D2-DD1F-3F4D-BCE8-A518079C2E49.root","timerange":[20190129,20190131]}, "fields":[]}

and then ran

myspark --script=LogFinder --spec=cond_tt_unmerged.spec --yarn

This returned:

### number of results 1                                                         
{"fields": [], "spec": {"query": ["/store/unmerged/logs/prod/2019/1/30/pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576/TOP-RunIISummer16NanoAODv4-00007_0/0000/0/b7b05f7d-f2d9-47ef-9465-06743ff2e9b1-305-0-logArchive.tar.gz"], "timerange": [20190129, 20190131]}, "queries": ["/store/unmerged/logs/prod/2019/1/30/pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576/TOP-RunIISummer16NanoAODv4-00007_0/0000/0/b7b05f7d-f2d9-47ef-9465-06743ff2e9b1-305-0-logArchive.tar.gz"]}

So this should finally be the location of the log file, and I should be able to retrieve it with xrdcp;
however, this last step fails for me:

xrdcp root://cms-xrd-global.cern.ch//store/unmerged/logs/prod/2019/1/30/pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576/TOP-RunIISummer16NanoAODv4-00007_0/0000/0/b7b05f7d-f2d9-47ef-9465-06743ff2e9b1-305-0-logArchive.tar.gz
[0B/0B][100%][==================================================][0B/s]  
Run: [ERROR] Server responded with an error: [3011] No servers are available to read the file.

Do you have any idea what might be the issue here?
Many, many thanks in advance!

@vkuznet (Contributor)

vkuznet commented Feb 22, 2019

Lukas, I'm glad that the WMArchive procedure to find records is working now, but I doubt I can help with the xrdcp problem. The storage of log files is beyond WMArchive's responsibility. You may ask Alan (@amaltaro) what the policy is for keeping log archive files, but my understanding is that they may be available on EOS.

@llayer (Author)

llayer commented Feb 22, 2019

Thank you Valentin, I will contact him!

@llayer (Author)

llayer commented Feb 22, 2019

Dear Alan @amaltaro,

for my service task I need to access and download a large number of log files.
I finally managed to locate the files with WMArchive, but now I have an issue
accessing the files with xrdcp:

xrdcp root://cms-xrd-global.cern.ch//store/unmerged/logs/prod/2019/1/30/pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576/TOP-RunIISummer16NanoAODv4-00007_0/0000/0/b7b05f7d-f2d9-47ef-9465-06743ff2e9b1-305-0-logArchive.tar.gz
[0B/0B][100%][==================================================][0B/s]  
Run: [ERROR] Server responded with an error: [3011] No servers are available to read the file.

Is there an easy way to download the logs once I have the location?

Many thanks!!

@amaltaro

Hi Lucas,
yes, there is an easy way to download them within their ~2-month lifetime: through EOS over HTTP. The problem is that we didn't have enough quota until the beginning of February, and all attempts to upload logs there had been failing since the beginning of January.

A second option would be to fetch them from the site the workflow is running at. However, those are transient files: they are collected by a LogCollect job, which makes a bigger tarball of logArchives and transfers it to CERN (both CASTOR and EOS).

Does this script find which logCollect tarball contains your unmerged logArchive?
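
(Once you have such a LogCollect tarball copied locally, you can check whether it contains your unmerged logArchive with plain tar, e.g.

tar -tvf <LogCollect-tarball>-logs.tar | grep logArchive

and then extract only the member you need with tar -xvf; the tarball name above is a placeholder.)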

@llayer (Author)

llayer commented Feb 22, 2019

Hi Alan,

thanks for your prompt reply.
If I understand correctly, there is a procedure described in https://github.com/dmwm/WMArchive/wiki/Steps-for-Querying-LogCollect-file-Location to obtain the logArchive.tar.gz for the unmerged
jobs of a merged job.
So I assumed that a query like

{"fields": [], "spec": {"query": ["/store/unmerged/logs/prod/2019/1/30/pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576/TOP-RunIISummer16NanoAODv4-00007_0/0000/0/b7b05f7d-f2d9-47ef-9465-06743ff2e9b1-305-0-logArchive.tar.gz"], "timerange": [20190129, 20190131]}, "queries": ["/store/unmerged/logs/prod/2019/1/30/pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576/TOP-RunIISummer16NanoAODv4-00007_0/0000/0/b7b05f7d-f2d9-47ef-9465-06743ff2e9b1-305-0-logArchive.tar.gz"]}

gives me the tarball of the unmerged logArchive,
but I don't see how I can access it on EOS?

@llayer (Author)

llayer commented Feb 25, 2019

Hi Alan @amaltaro,

Do you have a quick solution for downloading the logArchive.tar.gz described in the
post above? I tried a few things, but I was not able to find a working solution.
Having at least some log files downloaded would help me a lot to continue my work.

Thank you so much!

@amaltaro

Lucas, you can find these LogCollect tarballs in both the EOS and CASTOR CERN storage:

amaltaro@lxplus029:~/tmp $ eoscms ls /store/logs/prod/2019/01/WMAgent/pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576/
pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576-LogCollectForTOP-RunIISummer16NanoAODv4-00007_0-cmsgli-4624449-0-cmswn2424-2-logs.tar
pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576-LogCollectForTOP-RunIISummer16NanoAODv4-00007_0-cmsgli-4658037-0-cmswn2330-2-logs.tar
pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576-LogCollectForTOP-RunIISummer16NanoAODv4-00007_0-cmsgli-5015108-0-cmswn2406-2-logs.tar
pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576-LogCollectForTOP-RunIISummer16NanoAODv4-00007_0-node097-1-logs.tar
pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576-LogCollectForTOP-RunIISummer16NanoAODv4-00007_0-node101-1-logs.tar
pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576-LogCollectForTOP-RunIISummer16NanoAODv4-00007_0-node101-3-logs.tar
pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576-LogCollectForTOP-RunIISummer16NanoAODv4-00007_0-node101-4-logs.tar
pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576-LogCollectForTOP-RunIISummer16NanoAODv4-00007_0-node106-4-logs.tar
pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576-LogCollectForTOP-RunIISummer16NanoAODv4-00007_0-node109-1-logs.tar
pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576-LogCollectForTOP-RunIISummer16NanoAODv4-00007_0-node122-3-logs.tar
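
To copy one of them locally, something along these lines should work (I'm assuming the eoscms xrootd endpoint here; eos cp or the fuse mount should do as well):

xrdcp root://eoscms.cern.ch//eos/cms/store/logs/prod/2019/01/WMAgent/pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576/pdmvserv_task_TOP-RunIISummer16NanoAODv4-00007__v1_T_190129_142236_8576-LogCollectForTOP-RunIISummer16NanoAODv4-00007_0-cmsgli-4624449-0-cmswn2424-2-logs.tar .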

HTH. Alan.

@llayer (Author)

llayer commented Feb 25, 2019

Thank you Alan,

I finally managed to download the first files!

@llayer (Author)

llayer commented Apr 2, 2019

Dear @amaltaro,

for my project with @vlimant, over the last weeks I have collected the
LogCollect paths for all the failing logs that I need.
However, while downloading them from cmseos I got the impression that logs
older than ~02/2018 have been deleted; as a concrete example, I cannot find

eos ls /store/logs/prod/2017/11/WMAgent/pdmvserv_task_EGM-RunIIFall17GS-00010__v1_T_171013_161526_3751/pdmvserv_task_EGM-RunIIFall17GS-00010__v1_T_171013_161526_3751-LogCollectForEGM-RunIIFall17GS-00010_0-vmp365-1326-logs.tar
Unable to stat /eos/cms/store/logs/prod/2017/11/WMAgent/pdmvserv_task_EGM-RunIIFall17GS-00010__v1_T_171013_161526_3751/pdmvserv_task_EGM-RunIIFall17GS-00010__v1_T_171013_161526_3751-LogCollectForEGM-RunIIFall17GS-00010_0-vmp365-1326-logs.tar; No such file or directory (errc=2) (No such file or directory)

and it seems that at least most folders in /store/logs/prod/2017/11/WMAgent/ are empty.
Can you confirm that the logs are deleted at some point and do you know whether they
are archived somewhere else?

Many, many thanks in advance!

@amaltaro

amaltaro commented Apr 2, 2019

Hi @llayer, EOS is supposed to keep logs for only a couple of months, so consider yourself lucky for finding logs more than a year old in there.
You need to access the archival storage in CASTOR; you can use the same path as the one provided for EOS, but with the /castor/cern.ch/cms/store/logs/prod/ prefix instead (with xrdcp, for instance).

@llayer (Author)

llayer commented Apr 2, 2019

Hi @amaltaro,

many thanks, I am now able to locate the files on CASTOR, but copying with xrdcp returns an
error:

xrdcp root://castorcms.cern.ch//castor/cern.ch/cms/store/logs/prod/2017/11/WMAgent/pdmvserv_task_EGM-RunIIFall17GS-00010__v1_T_171013_161526_3751/pdmvserv_task_EGM-RunIIFall17GS-00010__v1_T_171013_161526_3751-LogCollectForEGM-RunIIFall17GS-00010_0-vmp365-1326-logs.tar . -OSsvcClass=OSsvcClass
[0B/0B][100%][==================================================][0B/s]  
Run: [ERROR] Server responded with an error: [3005] Unable to do async GET request. Insufficient privileges for user 76093,1399 performing a StageGetRequest request on svcClass 'OSsvcClass'; Communication error on send

Do I need some special permissions? I also cannot run a stager_qry:

stager_qry -M /castor/cern.ch/cms/store/logs/prod/2017/11/WMAgent/pdmvserv_task_EGM-RunIIFall17GS-00010__v1_T_171013_161526_3751/pdmvserv_task_EGM-RunIIFall17GS-00010__v1_T_171013_161526_3751-LogCollectForEGM-RunIIFall17GS-00010_0-vmp365-1326-logs.tar .
Error: Permission denied
stage_filequery: Insufficient privileges for user 76093,1399 performing a StageFileQueryRequest request on svcClass ''

I am not familiar with CASTOR, so if you could give me another hint, it would be fantastic!
Best wishes and many thanks again,

Cheers,
Lukas

@amaltaro

amaltaro commented Apr 3, 2019

Have you tried without the service class option? I haven't used CASTOR for a long time now, so you might need to ask CERN IT via a SNOW ticket.
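
That is, something like the following (the same path as in your command, just without the service class option; untested on my side):

xrdcp root://castorcms.cern.ch//castor/cern.ch/cms/store/logs/prod/2017/11/WMAgent/pdmvserv_task_EGM-RunIIFall17GS-00010__v1_T_171013_161526_3751/pdmvserv_task_EGM-RunIIFall17GS-00010__v1_T_171013_161526_3751-LogCollectForEGM-RunIIFall17GS-00010_0-vmp365-1326-logs.tar .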
