-
It's worth trying. With the recent activity around async task frameworks, it would be fairly simple to move CSV upload to an async process and email the user once it's done.
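As a rough illustration of the "hand it off and notify later" idea, here is a minimal sketch using only the Python standard library. It is not Superset code: a real deployment would more likely use a task queue such as Celery, and `notify_user`, `load_csv`, `run_upload` and `submit_upload` are hypothetical names made up for this example. The "email" is just appended to a list so the sketch stays self-contained.

```python
import csv
import io
import sqlite3
from concurrent.futures import Future, ThreadPoolExecutor

# Thread pool standing in for a real task queue (assumption for this sketch).
executor = ThreadPoolExecutor(max_workers=2)

# Stand-in outbox; a real notifier would send an email instead.
sent_messages = []


def notify_user(email: str, message: str) -> None:
    """Hypothetical notifier: records the message instead of emailing it."""
    sent_messages.append((email, message))


def load_csv(csv_text: str) -> int:
    """Parse CSV text, load it into a throwaway SQLite table, return the row count."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = [tuple(r) for r in reader]
    # Quote column names so arbitrary header text is a valid identifier.
    cols = ", ".join('"' + name.replace('"', '""') + '"' for name in header)
    marks = ", ".join("?" for _ in header)
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f"CREATE TABLE upload ({cols})")
        conn.executemany(f"INSERT INTO upload VALUES ({marks})", rows)
        return conn.execute("SELECT COUNT(*) FROM upload").fetchone()[0]
    finally:
        conn.close()


def run_upload(csv_text: str, email: str) -> int:
    """Do the slow work on a worker thread and notify the user either way."""
    try:
        loaded = load_csv(csv_text)
    except Exception as exc:
        notify_user(email, f"Upload failed: {exc}")
        raise
    notify_user(email, f"Upload finished: {loaded} rows loaded.")
    return loaded


def submit_upload(csv_text: str, email: str) -> Future:
    """Return immediately; the request handler does not wait for the load."""
    return executor.submit(run_upload, csv_text, email)
```

The point of the design is that the web request only calls `submit_upload` and returns, so a slow insert can no longer trip a gateway timeout; the user learns the outcome from the notification instead.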
-
I added the last line here to my mssql.py:
But it's not causing the
So either I added it in the wrong place for it to take effect, or the codebase is broken and it doesn't properly ingest and apply this setting...
-
Summary: uploading a spreadsheet to the data warehouse via Superset is unsatisfactory in two ways:
This post is about (1) but either way, there should be a better user experience to address (2). Could it be handled by a worker instead of the main Superset application?
Specifics:
My users think that their spreadsheet uploads fail because they get a gateway error message after 30 or 60 seconds. I benchmarked an upload of an .xlsx file using Superset 4.1.0rc2. The file is 182 KB, with just over 5k rows and 4 columns of data (1 date and 3 float columns). Not a huge file.
Benchmarking:
- `to_sql` with `method = 'multi'`: 16 seconds
- `to_sql` with `method = None`: 250 seconds

Possible fix:
Poking around in the code base, it looks like if I changed the MSSQL db engine spec, I could get Superset to write with `method = 'multi'`; see here: superset/superset/db_engine_specs/base.py, line 1286 in 5e42d7a.
I guess I could just add `supports_multivalues_insert = True` to the SQL Server spec?

But I'm confused: why then does `supports_multivalues_insert` not appear in a single existing db engine spec, as far as I can see? The only idea I have is that all of the DBs commonly used by developers override the `df_to_sql` function and so don't care about this variable?
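For readers unfamiliar with what the flag would change: pandas documents `method=None` as one standard INSERT per row and `method='multi'` as passing multiple values in a single INSERT clause. The sketch below illustrates that difference with the standard library only. `BaseSpec`, `MultiValuesSpec`, `insert_rows` and `demo` are hypothetical stand-ins, not Superset's engine spec classes, and in-memory SQLite will not reproduce the timings above; it only counts how many statements each strategy issues.

```python
import sqlite3
from typing import List, Sequence, Tuple


class BaseSpec:
    """Hypothetical stand-in for an engine spec; not Superset's actual class."""

    supports_multivalues_insert = False

    @classmethod
    def insert_rows(
        cls,
        conn: sqlite3.Connection,
        table: str,
        columns: Sequence[str],
        rows: List[Tuple],
        chunk_size: int = 100,
    ) -> int:
        """Insert rows and return the number of INSERT statements executed."""
        col_list = ", ".join(columns)
        one_row = "(" + ", ".join("?" for _ in columns) + ")"
        statements = 0
        if not cls.supports_multivalues_insert:
            # One INSERT per row, analogous to method=None.
            for row in rows:
                conn.execute(
                    f"INSERT INTO {table} ({col_list}) VALUES {one_row}", row
                )
                statements += 1
            return statements
        # One INSERT per chunk of rows, analogous to method='multi'.
        for start in range(0, len(rows), chunk_size):
            chunk = rows[start:start + chunk_size]
            values = ", ".join(one_row for _ in chunk)
            params = [value for row in chunk for value in row]
            conn.execute(
                f"INSERT INTO {table} ({col_list}) VALUES {values}", params
            )
            statements += 1
        return statements


class MultiValuesSpec(BaseSpec):
    """Same hypothetical spec with the flag switched on."""

    supports_multivalues_insert = True


def demo() -> dict:
    """Load 5000 rows (1 date, 3 floats) with each spec; report (statements, rows)."""
    rows = [
        (f"2024-01-{d % 28 + 1:02d}", d * 1.0, d * 2.0, d * 3.0)
        for d in range(5000)
    ]
    results = {}
    for spec in (BaseSpec, MultiValuesSpec):
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE upload (day TEXT, a REAL, b REAL, c REAL)")
        count = spec.insert_rows(conn, "upload", ["day", "a", "b", "c"], rows)
        total = conn.execute("SELECT COUNT(*) FROM upload").fetchone()[0]
        results[spec.__name__] = (count, total)
        conn.close()
    return results


if __name__ == "__main__":
    print(demo())
```

With a remote database each statement is typically at least one network round trip, which is one plausible reason the row-by-row path is so much slower in the benchmark; the chunk size here is kept small so the number of bound parameters per statement stays modest.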