Need to load an HTML string or webpage into Excel for processing, visualization, or storage? This guide explains how to convert HTML content directly to an Excel workbook using Aspose.Cells for .NET.
Introduction
Converting HTML data to Excel can be a challenging task due to the complexity of HTML structures and the need for precise formatting in Excel. However, with Aspose.Cells for .NET, this process becomes straightforward and efficient.
This article will walk you through the steps required to convert an HTML string or file into an Excel workbook using C#. We’ll cover everything from installing the necessary package to saving the final output as a .xlsx
file.
Use Cases for HTML to Excel Conversion
- Convert email or CMS data into Excel
- Process HTML reports or exports from third-party platforms
- Import web tables into structured spreadsheets
Step-by-Step Guide
Step 1: Install Aspose.Cells for .NET
$ dotnet add package Aspose.Cells
Step 2: Prepare HTML as a String
Prepare your HTML content either from an external source or directly within the code.
string htmlString = "<html><body><table><tr><td>Item</td><td>Price</td></tr><tr><td>Book</td><td>20</td></tr></table></body></html>";
Step 3: Convert String to Stream
Convert the HTML string into a MemoryStream
object, which is required for loading HTML content.
using (MemoryStream htmlStream = new MemoryStream(Encoding.UTF8.GetBytes(htmlString)))
{
// Proceed with loading the stream into an Excel workbook
}
Step 4: Load HTML Stream with HtmlLoadOptions
Use HtmlLoadOptions
to load the HTML content from the MemoryStream
and create a new Workbook
object.
using (MemoryStream htmlStream = new MemoryStream(Encoding.UTF8.GetBytes(htmlString)))
{
Workbook workbook = new Workbook(htmlStream, new HtmlLoadOptions());
}
Step 5: Work With the Workbook (Optional)
Once the HTML content is loaded into a Workbook
, you can manipulate it as needed. For example, you might want to add formulas or styles.
using (MemoryStream htmlStream = new MemoryStream(Encoding.UTF8.GetBytes(htmlString)))
{
Workbook workbook = new Workbook(htmlStream, new HtmlLoadOptions());
Worksheet sheet = workbook.Worksheets[0];
// Add formulas, styles, or modify data
}
Step 6: Save the Resulting Excel File
Finally, save the modified Workbook
to a file on disk.
using (MemoryStream htmlStream = new MemoryStream(Encoding.UTF8.GetBytes(htmlString)))
{
Workbook workbook = new Workbook(htmlStream, new HtmlLoadOptions());
// Optional: Modify the data or format
Worksheet sheet = workbook.Worksheets[0];
sheet.AutoFitColumns();
workbook.Save("converted.xlsx", SaveFormat.Xlsx);
}
Complete Code Example
Here is a complete example that demonstrates how to convert an HTML string into an Excel file using Aspose.Cells for .NET.
using System;
using System.IO;
using System.Text;
using Aspose.Cells;
public class HtmlToExcelConverter
{
public static void Main()
{
string html = "<html><body><table><tr><td>Name</td><td>Score</td></tr><tr><td>Alice</td><td>92</td></tr></table></body></html>";
using (MemoryStream htmlStream = new MemoryStream(Encoding.UTF8.GetBytes(html)))
{
Workbook workbook = new Workbook(htmlStream, new HtmlLoadOptions());
Worksheet sheet = workbook.Worksheets[0];
sheet.AutoFitColumns();
workbook.Save("html_to_excel.xlsx", SaveFormat.Xlsx);
}
Console.WriteLine("HTML converted to Excel.");
}
}
Best Practices
Practice | Benefit |
---|---|
Use streams for web integration | Easier to handle in APIs |
Use HtmlLoadOptions | Customize parsing or handle advanced HTML content |
Auto-fit columns | Improve readability of output |
Subsection: Handling Complex HTML Structures
When dealing with complex HTML structures, it’s important to use the HtmlLoadOptions
class to customize how Aspose.Cells parses the HTML. This can include setting options such as ignoring certain elements or handling specific tags differently.
using System;
using System.IO;
using System.Text;
using Aspose.Cells;
public class ComplexHtmlToExcelConverter
{
public static void Main()
{
string complexHtml = @"
<html>
<body>
<table border='1'>
<tr>
<th>Product</th>
<th>Price</th>
<th>Quantity</th>
</tr>
<tr>
<td>Laptop</td>
<td>$999.99</td>
<td>5</td>
</tr>
<tr>
<td>Smartphone</td>
<td>$699.99</td>
<td>10</td>
</tr>
</table>
</body>
</html>";
// Create a MemoryStream from the HTML string
using (MemoryStream htmlStream = new MemoryStream(Encoding.UTF8.GetBytes(complexHtml)))
{
// Use HtmlLoadOptions to customize parsing if needed
HtmlLoadOptions loadOptions = new HtmlLoadOptions();
loadOptions.AutoFilterColumns = true; // Enable auto-filter for columns
// Load the HTML content into a Workbook
Workbook workbook = new Workbook(htmlStream, loadOptions);
// Access the first worksheet
Worksheet sheet = workbook.Worksheets[0];
// Auto-fit all columns to improve readability
sheet.AutoFitColumns();
// Save the resulting Excel file
workbook.Save("complex_html_to_excel.xlsx", SaveFormat.Xlsx);
}
Console.WriteLine("Complex HTML converted to Excel.");
}
}
Conclusion
Converting HTML content to Excel using Aspose.Cells for .NET is a powerful way to integrate web data into your Excel workflows. By following this guide, you’ll be able to handle both simple and complex HTML structures with ease.