-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Excel Exporter - support for options #37
Conversation
f46cff2
to
365ee8e
Compare
test.xls was produced while trying out in pry, and was mistakenly added into git history
(Before the review) Please-please, try to do the descriptive descriptions (sorry for tautology) for your PRs. To write "This enchances this and that, while leaving that and this intact" will take just 5 minutes of your time, but all reviewers would be really grateful. |
lib/daru/io/exporters/excel.rb
Outdated
# "dataframe_df.xls", | ||
# data_options: normal, | ||
# header_options: heading | ||
# ).call |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm totally not sure, but don't you think it is something else? Like "format of ...":
Daru::IO::Exporters::Excel.new(formatting: {header: blah, data: blah}).call(path)
# or even...
Daru::IO::Exporters::Excel.new(formatting: {order: blah, index: blah, data: blah}).call(path)
(BTW, as you can see by my example, I am totally not sure where path
should be passed... We can discuss it later, though, leave your version)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, all arguments are given only in the initialize()
method and none to call()
.
Daru::IO::Exporters::Excel.new(
df,
'path/file.xls',
formatting: {header: {...}, data: {...}, index: {...}},
display: {header: true, index: false, data: true}
)
Would it be better to rename :header
to :order
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, all arguments are given only in the initialize() method and none to call().
OK, let's leave it this way, at least for now.
Would it be better to rename
:header
to:order
?
¯\_(ツ)_/¯ I can imagine a lot of pro and contra arguments. But probably we should stick to "dataframe point of view" (data, index, order). But² maybe "index & orders" could be called "headers" in this context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it is EXTREMELY bad time to ask, but why you've choose spreadsheet
gem?.. It doesn't looks reasonable for me, TBH.
lib/daru/io/exporters/excel.rb
Outdated
# | ||
# @option options formatting [Hash<Hash>] A Hash of formatting options. Supported keys | ||
# are +:header+, +:data+ and +:index+. For example, if +:header+ is set to a Hash | ||
# containing keys such as +:color+, the headers are written with the specified color. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two hashes with same keys looks suspicious to me... Maybe we can join them, this way:
to_excel(dataframe, 'test.xlsx',
header: false, # or nil, don't show
index: {color: :red, bold: true}, # show, custom formatting
data: true # show, default formatting, can be omitted
WDYT?..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Acknowledged, this does seem much more intuitive to use. Made this and the below changes with commit 6fe329a.
lib/daru/io/exporters/excel.rb
Outdated
# index: { color: :green }, | ||
# data: { color: :blue } | ||
# }, | ||
# display: { index: false } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...for example, this example is weird, first set formatting of index, then tell we don't want to show it ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, the use-case I was considering is when a daru-io user decides to fix their :formatting
and just decide to play-around with the :display
hash.
BTW, after having gone through the daru-view repository during the review discussion, I came across some frequently-used default options being declared as a class variable like Daru::View.adapter = :highcharts
. Thinking about it, could it be used to define some default options? (Maybe, coupled with meta-programming?) That is, something like Daru::IO::Exporters::Excel.index = {color: :green}
? Or does this seem complex?
lib/daru/io/exporters/excel.rb
Outdated
@options = options | ||
@path = path | ||
@display = options[:display] || {} | ||
@formatting = options[:formatting] || {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fetch
lib/daru/io/exporters/excel.rb
Outdated
sheet = book.create_worksheet | ||
@book = Spreadsheet::Workbook.new | ||
@sheet = @book.create_worksheet | ||
@row_offset = @display[:header] ? 1 : 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say write_headers
should shift the @row_offset
.
lib/daru/io/exporters/excel.rb
Outdated
def process_col_offset | ||
return 0 unless @display[:index] | ||
return 1 unless @dataframe.index.first.is_a?(Array) | ||
@dataframe.index.first.count |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we have more idiomatic way to do it?.. Like
return 1 unless @dataframe.index.is_a?(MultiIndex)
@dataframe.index.levels
lib/daru/io/exporters/excel.rb
Outdated
format = Spreadsheet::Format.new color: :blue, weight: :bold | ||
if @display[:index] || @display[:data] | ||
@dataframe.each_row_with_index.with_index do |row_idx, i| | ||
row, idx = row_idx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can just each_row_with_index.with_index do |(row, idx), i|
, probably
lib/daru/io/exporters/excel.rb
Outdated
@dataframe.each_row_with_index.with_index do |row_idx, i| | ||
row, idx = row_idx | ||
|
||
write_index(idx, i+@row_offset) if @display[:index] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe those if
s place is inside write_index
/write_data
lib/daru/io/exporters/excel.rb
Outdated
@sheet.row(0).concat([' '] * @col_offset + @dataframe.vectors.to_a.map(&:to_s)) | ||
end | ||
|
||
def write_index(idx, i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why , i
if it should be row
or at least r
?..
lib/daru/io/exporters/excel.rb
Outdated
|
||
book.write(@path) | ||
def write_data(row, i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same
lib/daru/io/exporters/excel.rb
Outdated
|
||
book.write(@path) | ||
def write_data(row, i) | ||
(@col_offset..(@col_offset + @dataframe.vectors.to_a.size-1)).each do |x| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vectors.to_a.size
, seriously? More idiomatic way?..- Instead of
-1
, you should use just excluding range (...
). Or you can also do@dataframe.ncols.times
. Or evenrow.size.times
x
is extremely bad name
@zverok - The existing Excel Exporter was already based on 'Spreadsheet' gem, and I didn't want to break any backward dependencies by changing it. Also, I didn't find any compelling reason to change the Excel Exporter to depend on a different gem. Hence, I chose to stick with the 'Spreadsheet' gem itself. If I'm missing out on any issue with using 'Spreadsheet' gem / if there are better alternatives that you'd like to suggest, please feel free to suggest. 😄 Regarding rest of the review comments, acknowledged. I'll soon make the changes. |
YARD Docs have been updated. Using the :data, :index, :header keyword arguments seems to be much more easier and intutive than using something like options[:display][:data]. Code quality has been improved, as per review comments.
@zverok - I've made the above changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly good, don't forget about last comments.
lib/daru/io/exporters/excel.rb
Outdated
process_offsets | ||
write_headers | ||
|
||
@dataframe.each_row_with_index.with_index do |(row, idx), i| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
, r|
or , dr|
(delta row)
lib/daru/io/exporters/excel.rb
Outdated
def process_offsets | ||
@row_offset = @header ? 1 : 0 | ||
@col_offset = 0 unless @index | ||
@col_offset ||= @dataframe.index.is_a?(Daru::MultiIndex) ? @dataframe.index.levels.size : 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, @dataframe.index.width
for MultiIndex
lib/daru/io/exporters/excel.rb
Outdated
@sheet.row(row).concat(idx.is_a?(Array) ? idx.to_a.map(&:to_s) : [idx.to_s]) | ||
return unless @index.is_a?(Hash) | ||
|
||
@col_offset.times { |col| @sheet.row(row).set_format(col, Spreadsheet::Format.new(@index)) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a bit cleaner code, I'd define helper method:
def format(col_range, row, format)
return unless format.is_a?(Hash)
....
end
Then here you can just
format(0...@col_offset, row, @index)
and below
format(@col_offset...@col_offset + @dataframe.ncols, idx, @data)
WDYT?..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this looks much better & DRYier. I've added this with formatting
method, and other methods like write_data
, write_index
redirect to formatting
with different arguments. 👍
OK, let's merge it. But please add issue about underlying gem. The problem is, there are basically 2 (absolutely different) formats of Excel files: old binary one ( |
No description provided.