Added new kb article convert-pdf-table-to-datatable #492

Open
wants to merge 2 commits into
base: master
62 changes: 62 additions & 0 deletions knowledge-base/convert-pdf-table-to-datatable.md
@@ -0,0 +1,62 @@
---
title: Converting PDF Table Content to DataTable
description: Learn how to transform a table from a PDF file into a DataTable object using the Telerik Document Processing libraries.
type: how-to
page_title: How to Convert PDF Table to DataTable with Telerik Document Processing
slug: convert-pdf-table-to-datatable
tags: document, processing, table, datatable, convert, pdf, excel
res_type: kb
ticketid: 1675626
---

## Environment

| Version | Product | Author |
| ---- | ---- | ---- |
| 2024.4.1106 | Telerik Document Processing Libraries | [Desislava Yordanova](https://www.telerik.com/blogs/author/desislava-yordanova) |

## Description

Learn how to convert a specific table from a PDF file into a DataTable object using **Telerik Document Processing** libraries.

## Solution

The Telerik Document Processing libraries **do not** offer a **direct** method for converting a PDF table to a DataTable object. As a workaround, use MS Excel or [RadSpreadsheet](https://docs.telerik.com/devtools/winforms/controls/spreadsheet/overview) as an intermediate step:

1. Select and copy the desired table's content from the PDF file.
2. Paste the copied content into **MS Excel** or **RadSpreadsheet**. The table content is placed into worksheet cells.
3. Save the document as an XLSX file, for example with [RadSpreadProcessing]({%slug radspreadprocessing-overview%}).
4. Import the XLSX file with RadSpreadProcessing and export the worksheet to a DataTable with the [DataTableFormatProvider]({%slug radspreadprocessing-formats-and-conversion-using-data-table-format-provider%}).

Here is a code snippet demonstrating the conversion of an XLSX document to a DataTable using RadSpreadProcessing:

```csharp
using System.Data;
using System.IO;
using Telerik.Windows.Documents.Spreadsheet.FormatProviders;
using Telerik.Windows.Documents.Spreadsheet.FormatProviders.OpenXml.Xlsx;
using Telerik.Windows.Documents.Spreadsheet.Model;

// Load the XLSX file into a Workbook
Workbook workbook;
using (FileStream input = new FileStream("path_to_your_xlsx_file.xlsx", FileMode.Open))
{
    IWorkbookFormatProvider formatProvider = new XlsxFormatProvider();
    workbook = formatProvider.Import(input);
}

// Convert the first worksheet to a DataTable
Worksheet worksheet = workbook.Worksheets[0];
DataTableFormatProvider dataTableFormatProvider = new DataTableFormatProvider();
DataTable dataTable = dataTableFormatProvider.Export(worksheet);
```
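
Once the export completes, the result is a regular `System.Data.DataTable` and can be read like any other. The following is a minimal sketch of iterating its columns and rows. It uses a hand-built stand-in table rather than the Telerik export so that it runs without the libraries. The `TableDump` class and its `BuildSampleTable` and `ToLines` methods are hypothetical names introduced only for illustration, and the sample data is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;
using System.Linq;

public static class TableDump
{
    // Stand-in for the DataTable returned by DataTableFormatProvider.Export (assumed shape).
    public static DataTable BuildSampleTable()
    {
        DataTable table = new DataTable();
        table.Columns.Add("Product", typeof(string));
        table.Columns.Add("Quantity", typeof(int));
        table.Rows.Add("Widget", 3);
        table.Rows.Add("Gadget", 5);
        return table;
    }

    // Returns one tab-separated line with the column names, then one line per row.
    public static List<string> ToLines(DataTable table)
    {
        List<string> lines = new List<string>();
        lines.Add(string.Join("\t", table.Columns.Cast<DataColumn>().Select(c => c.ColumnName)));

        foreach (DataRow row in table.Rows)
        {
            // Format each cell with the invariant culture so numbers look the same on every machine.
            IEnumerable<string> cells = row.ItemArray
                .Select(value => Convert.ToString(value, CultureInfo.InvariantCulture));
            lines.Add(string.Join("\t", cells));
        }

        return lines;
    }
}
```

In a real project, pass the `dataTable` produced by the export to `TableDump.ToLines` instead of the sample table. How column names and types come out of the export depends on the `DataTableFormatProvider` settings, so check the first line of the output before relying on specific column names.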

This approach combines a manual copy-and-paste step with the Telerik Document Processing libraries so that the content of a PDF table can be used as a DataTable.

## See Also

- [RadWordsProcessing Overview]({%slug radwordsprocessing-overview%})
- [RadSpreadProcessing Overview]({%slug radspreadprocessing-overview%})
- [Using DataTable Format Provider]({%slug radspreadprocessing-formats-and-conversion-using-data-table-format-provider%})
- [Import and Export to Excel File Formats]({%slug radspreadprocessing-formats-and-conversion-xlsx-xlsxformatprovider%})
@@ -62,4 +62,5 @@ Example 3 demonstrates how you can export an existing Worksheet to a DataTable.

# See Also

* [Settings]({%slug radspreadprocessing-formats-and-conversion-data-table-formatprovider-settings%})
* [Settings]({%slug radspreadprocessing-formats-and-conversion-data-table-formatprovider-settings%})
* [Converting PDF Table Content to DataTable]({%slug convert-pdf-table-to-datatable%})