Skip to content

Commit

Permalink
🐛 Fix rerun for entries that came from zips
Browse files Browse the repository at this point in the history
Previously, when rerunning an entry that came from a zip file, the rerun
would fail because it would look for the CSV in an assumed location
which is based on it's last importer ID.  Since this was a rerun, it
does not do an unzip into the assumed location so the directory does not
exist.

This commit will first check if the assumed location exists, and if not,
it will look for the location of the last unizpped files and use that
for the rerun.

This does cause an interesting behavior where if the entry is a work
with a file attached, it will add the file again resulting in duplicate
files.  I feel this is such an edge case though because typically if the
entry is successful, the user will not rerun it.  I added a hint text to
the importer to let the user know this is a possibility.

Ref:
- notch8/palni_palci_knapsack#210
  • Loading branch information
kirkkwang committed Jan 28, 2025
1 parent 0e8de61 commit 32083d9
Show file tree
Hide file tree
Showing 3 changed files with 19 additions and 1 deletion.
19 changes: 18 additions & 1 deletion app/parsers/bulkrax/csv_parser.rb
Original file line number Diff line number Diff line change
Expand Up @@ -360,6 +360,10 @@ def path_to_files(**args)
@path_to_files = File.join(
zip? ? importer_unzip_path : File.dirname(import_file_path), 'files', filename
)

return @path_to_files if File.exist?(@path_to_files)

File.join(real_importer_unzip_path, 'files', filename) if file? && zip?
end

private
Expand All @@ -379,7 +383,7 @@ def unique_collection_identifier(collection_hash)
# We expect a single CSV at the top level of the zip in the CSVParser
# but we are willing to go look for it if need be
def real_import_file_path
return Dir["#{importer_unzip_path}/**/*.csv"].reject { |path| in_files_dir?(path) }.first if file? && zip?
return Dir["#{real_importer_unzip_path}/**/*.csv"].reject { |path| in_files_dir?(path) }.first if file? && zip?

parser_fields['import_file_path']
end
Expand All @@ -389,5 +393,18 @@ def real_import_file_path
def in_files_dir?(path)
File.dirname(path).ends_with?('files')
end

# If we don't have an existing unzip path, we'll try and find it.
# Just in case there are multiple paths, we sort by the number at the end of the path and get the last one
def real_importer_unzip_path
return importer_unzip_path if Dir.exist?(importer_unzip_path)

Dir.glob(base_importer_unzip_path + '*').sort_by { |path| path.split(base_importer_unzip_path).last[1..-1].to_i }.last
end

def base_importer_unzip_path
# turns "tmp/imports/tenant/import_1_20250122035229_1" to "tmp/imports/tenant/import_1_20250122035229"
importer_unzip_path.split('_')[0...-1].join('_')
end
end
end
1 change: 1 addition & 0 deletions app/views/bulkrax/importers/_edit_item_buttons.html.erb
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
<h5>Options for Updating an Entry</h5>
<hr />
<p>Rebuild metadata and files.</p>
<p class="text-warning">Files may be duplicated if this option is used on a successful entry. Consider using <em>Remove and then Build</em> instead.</p>
<%= link_to 'Build', item_entry_path(item, e), method: :patch, class: 'btn btn-primary' %>
<hr />
<p>Remove existing work and then recreate the works metadata and files.</p>
Expand Down
Binary file added spec/fixtures/csv/files/moon.jpg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.

0 comments on commit 32083d9

Please sign in to comment.