How to Import PDF to Excel Using Power Query (Step-by-Step)

DataConvertPro

Copying and pasting data from a PDF into Excel is a nightmare. It's tedious, error-prone, and usually ruins your formatting. Most people don't realize that Microsoft built a tool directly into Excel to solve this. It's called Power Query. This engine lets you connect to a PDF file, extract structured tables, and clean the data before it ever hits your spreadsheet. If you've been manually typing numbers from a bank statement or an invoice, you're losing hours of your life. In this guide, we'll show you exactly how to use the native connector to automate your workflow. You'll learn how to handle messy layouts and multi-page reports like a pro.

Why Power Query is a Game Changer for PDF Data

Before we dive into the steps, it's important to understand why this method beats the typical "free" online converter. When you use a random website to convert a file, you're often handing over sensitive data to an unknown server. Plus, those converters are one-shot tools. If a new PDF arrives next month, you have to convert and clean it all over again.

Power Query is different because it's a "Get and Transform" tool. It doesn't just copy the data. It remembers the steps you took to clean it. If you get a new version of the PDF next week, you simply click "Refresh" and Excel performs all those cleaning steps automatically. It's the difference between doing a chore and building a machine to do it for you.

Requirements for Importing PDF to Excel

You won't find the PDF connector in every version of Excel. Microsoft ships it only in newer builds.

  1. Microsoft 365 Subscription: This is the most reliable way to ensure you have the tool. If you're on a monthly or yearly 365 plan, you likely have it.
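  2. Excel 2021 or Later: Power Query itself exists in Excel 2016 and 2019, but the "From PDF" connector arrived later. To our knowledge it is only included in Microsoft 365 and in Excel 2021 or newer, so updating a 2016 or 2019 installation is unlikely to add it. If you're on a supported version and still don't see it, update your Office installation.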
  3. Windows OS: Currently, the PDF connector in Power Query is much more robust on Windows than it is on Mac. Mac users may find the options limited or missing entirely.
  4. Searchable Text: Power Query is great at reading "native" PDFs. These are files created by Word, Excel, or accounting software. If your PDF is a grainy photo of a document, Power Query will struggle because it has no built-in OCR engine and can only read an existing text layer.

If you find that your version of Excel is missing these features, or you're dealing with hundreds of complex documents, you should check out our ultimate guide to PDF to Excel converters to find an alternative that fits your tech stack.

The Basic Import: Getting Your Data into the Editor

Let's start with a simple, single table PDF. This is the foundation for everything else we'll do.

Step 1: Connect to the File

Open a blank Excel workbook. Navigate to the Data tab on the top ribbon. Click on Get Data, then hover over From File, and finally select From PDF.

Screenshot description: The Excel interface showing the Data tab. The cursor is hovering over the 'From File' menu, highlighting the 'From PDF' option at the bottom of the list.

Step 2: Select Your PDF

A file browser window will pop up. Locate the PDF file you want to import on your hard drive and click Import.

Step 3: Use the Navigator Window

Once you click import, Excel will spend a few seconds analyzing the document. A Navigator window will appear. This is where the magic happens. On the left side, you'll see a list of everything Excel found.

Excel categorizes PDF data into two types:

  • Table Objects: These are structures that Excel recognized as intentional tables. They usually have headers and clear rows.
  • Page Objects: These represent the entire physical page. If your table isn't perfectly formatted, you might need to select the Page object instead of the Table object.

Click on a Table or Page in the left pane to see a preview on the right. Once you find the data you want, don't click Load. Instead, click Transform Data.

Screenshot description: The Navigator window showing a folder structure on the left with icons for 'Table001' and 'Page 1'. The right side shows a grid preview of the data with columns labeled Column1, Column2, etc.

The Power Query Editor: Cleaning the Mess

Clicking "Transform Data" opens the Power Query Editor. This is a separate window from Excel. It's a sandbox where you can reshape data without changing the original file.

Promoting Headers

Often, Power Query will label your columns as "Column1" or "Column2" and put your actual headers in the first row of data. To fix this, go to the Home tab in the editor and click Use First Row as Headers.
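To make the idea concrete, here is a minimal Python sketch of what promoting headers amounts to. It is not how Power Query implements the step (Power Query records its own steps behind the scenes); the function name `promote_headers` and the sample rows are made up for illustration.

```python
def promote_headers(rows):
    """Treat the first row as column names and return the rest as dicts."""
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical import result: the real headers are stuck in the first data row.
raw = [
    ["Date", "Description", "Amount"],
    ["2024-01-05", "Coffee", "4.50"],
    ["2024-01-06", "Books", "23.00"],
]

table = promote_headers(raw)
print(table[0])
```

After the step, each value is addressed by its real column name instead of a generic "Column1" label.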

Removing Unnecessary Rows

PDFs often have page headers, footers, or page numbers that get caught in the import. To get rid of them, click the filter arrow on one of your columns and uncheck "(null)" or any specific text that doesn't belong in your final table. If the junk sits in the first few rows, Home > Remove Rows > Remove Top Rows is quicker.
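The filtering logic can be sketched in a few lines of Python. This is an illustration only, assuming junk rows are either completely empty or start with a known prefix such as "Page "; the function name, the prefix, and the sample rows are hypothetical.

```python
def drop_junk_rows(rows, junk_prefixes=("Page ",)):
    """Drop rows that are entirely empty or whose first cell starts with a junk prefix."""
    kept = []
    for row in rows:
        if all(cell is None for cell in row):
            continue  # blank row left over from PDF layout
        first = row[0]
        if isinstance(first, str) and first.startswith(junk_prefixes):
            continue  # page footer such as "Page 1 of 3"
        kept.append(row)
    return kept


# Hypothetical rows as they might arrive from a two-page statement.
imported = [
    ["2024-01-05", "Coffee", "4.50"],
    [None, None, None],
    ["Page 1 of 3", None, None],
    ["2024-01-06", "Books", "23.00"],
]

clean = drop_junk_rows(imported)
print(clean)
```

Filtering on a rule rather than deleting rows by hand matters for refreshes: the same rule keeps working when next month's file has a different number of pages.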

Changing Data Types

This is a crucial step. Power Query tries to guess if a column is a number, date, or text. Sometimes it gets it wrong. Look at the icon next to each column header.

  • 123 means Whole Number.
  • 1.2 means Decimal Number.
  • ABC means Text.
  • $ means Currency.

If your "Total Price" column is marked as text, you won't be able to sum it in Excel. Right click the column header, select Change Type, and choose the correct format.

If you encounter weird spacing or alignment issues during this stage, you aren't alone. Check out our guide on how to fix PDF table formatting issues in Excel for advanced cleaning tips.

Handling Multi-Page PDFs

One of the biggest headaches is a table that spans across five or ten pages. You don't want to import five separate tables and manually stitch them together. You want one continuous list.

The Appending Method

If your PDF has the same table structure on every page, you can use the Navigator to select multiple items. Check the box at the top that says Select multiple items. Select every table you need and click Transform Data.

Once in the editor, you'll see multiple queries on the left. Use the Append Queries feature on the Home tab to stack them on top of each other. Columns are matched by name, so make sure each query has the same headers first. This creates a single master table.
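As a rough illustration of what appending does, the Python sketch below stacks tables and lines up columns by name, leaving a blank where a table lacks a column. The function `append_tables` and the sample pages are hypothetical and only mirror the behavior described above.

```python
def append_tables(*tables):
    """Stack tables (lists of dicts), matching columns by name."""
    columns = []
    for table in tables:
        for row in table:
            for name in row:
                if name not in columns:
                    columns.append(name)  # keep first-seen column order
    return [
        {name: row.get(name) for name in columns}
        for table in tables
        for row in table
    ]


# Hypothetical per-page tables; page 2 carries one extra column.
page_1 = [{"Date": "2024-01-05", "Amount": "4.50"}]
page_2 = [{"Date": "2024-01-06", "Amount": "23.00", "Note": "refund"}]

master = append_tables(page_1, page_2)
print(master)
```

A mismatched header (say "Amount" on one page and "Amt" on another) would produce two half-empty columns, which is why consistent headers matter before appending.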

The "Folder" Logic for Infinite Pages

If you have a 100 page PDF, selecting each item manually is a waste of time. Instead, you can work from the full list of objects that Power Query builds for the file.

When you import a PDF, the initial "Source" step in Power Query contains a table listing every table and page it detected, with the contents of each one held in a "Data" column. Rather than drilling into a single entry, delete the navigation step that follows "Source", filter the "Kind" column to either Table or Page, keep the "Data" column, and expand it. This "unrolls" every matching object into a single long table, provided the pages share the same column layout.
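The unrolling step can be pictured with a short Python sketch. It assumes the listing has one entry per detected object, with a "Kind" field marking it as a Table or a Page and a "Data" field holding its rows; the sample listing and the function name `expand_all` are invented for illustration.

```python
def expand_all(listing, kind="Table"):
    """Keep entries of one kind and stack their rows into one long list."""
    combined = []
    for entry in listing:
        if entry["Kind"] == kind:
            combined.extend(entry["Data"])
    return combined


# Hypothetical stand-in for the object listing of a short PDF.
source = [
    {"Id": "Page001", "Kind": "Page",
     "Data": [["Statement", None], ["2024-01-05", "4.50"]]},
    {"Id": "Table001", "Kind": "Table", "Data": [["2024-01-05", "4.50"]]},
    {"Id": "Table002", "Kind": "Table", "Data": [["2024-01-06", "23.00"]]},
]

all_rows = expand_all(source)
print(all_rows)
```

Filtering to one kind before expanding avoids importing the same data twice, since a page entry usually contains the rows of the tables on that page as well.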

Troubleshooting Common Power Query PDF Issues

Even with a tool this powerful, things can go wrong. Here are the most common hurdles and how to jump over them.

1. Merged Cells Causing Nulls

If your PDF has merged cells for category names, Power Query will put the name in the first row and "null" in every row below it.
Solution: Select the column with the nulls. Right click and select Fill, then Down. Power Query will replace each null with the nearest value above it.
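For readers who want to see the logic spelled out, here is a small Python sketch of a fill-down: each empty cell takes the last non-empty value above it. The function name and the sample data are hypothetical.

```python
def fill_down(rows, column):
    """Replace None in one column with the last non-None value above it."""
    last_seen = None
    filled = []
    for row in rows:
        new_row = list(row)  # copy so the input rows stay untouched
        if new_row[column] is None:
            new_row[column] = last_seen
        else:
            last_seen = new_row[column]
        filled.append(new_row)
    return filled


# Hypothetical import of a table whose category cells were merged in the PDF.
merged = [
    ["Hardware", "Laptop", "900"],
    [None, "Monitor", "250"],
    ["Software", "License", "120"],
    [None, "Support", "60"],
]

result = fill_down(merged, column=0)
print(result)
```

Note that a blank at the very top of the column stays blank, because there is nothing above it to copy.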

2. Numbers Importing as Text

This often happens because of currency symbols like "$" or commas. Power Query sees the symbol and assumes it's a word.
Solution: Use the Replace Values feature to remove the "$" symbol. Then, change the data type to Decimal Number.
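The same two-step fix (strip the symbol, then convert) looks like this as a Python sketch. It assumes US-style formatting where the comma is a thousands separator; `parse_amount` is a hypothetical helper, not something Power Query exposes.

```python
from decimal import Decimal, InvalidOperation


def parse_amount(text, symbol="$"):
    """Strip a currency symbol and thousands separators, then convert to Decimal.

    Returns None for blanks or values that still are not numeric.
    """
    if text is None:
        return None
    # Assumption: commas are thousands separators (US-style numbers).
    cleaned = text.strip().replace(symbol, "").replace(",", "")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


# Hypothetical "Total Price" column as it arrives from the PDF.
amounts = ["$1,234.50", "$99", "n/a"]
parsed = [parse_amount(a) for a in amounts]
print(parsed)
```

Values that cannot be converted come back empty instead of stopping the whole import, which makes leftover problem cells easy to spot and filter.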

3. Data is "Shifted" or Misaligned

If one row has more data points than another, the columns might shift to the left. This usually happens in PDFs with varying column widths.
Solution: You may need to use the Split Column by Delimiter or Split Column by Position tool in the editor. Sometimes it's easier to merge two columns and then split them again using a specific character.
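Splitting by delimiter is easier to reason about with an example. The Python sketch below assumes two fields ended up in one cell separated by a "|" character; the delimiter, the function name `split_column`, and the rows are illustrative only.

```python
def split_column(rows, column, delimiter, parts=2):
    """Split one cell into a fixed number of columns, padding short rows with None."""
    out = []
    for row in rows:
        value = row[column]
        if value is None:
            pieces = []
        else:
            pieces = [p.strip() for p in value.split(delimiter, parts - 1)]
        pieces += [None] * (parts - len(pieces))  # keep every row the same width
        out.append(row[:column] + pieces + row[column + 1:])
    return out


# Hypothetical rows where the invoice number and item name share a cell.
shifted = [
    ["INV-001 | Widget", "4.50"],
    ["INV-002", "9.00"],
]

aligned = split_column(shifted, column=0, delimiter="|")
print(aligned)
```

Padding short rows is the important part: every row ends up with the same number of columns, so values stop sliding into the wrong field.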

4. The "From PDF" Option is Missing

If you've checked your updates and it's still not there, you might be on a "Semi-Annual Enterprise Channel" for Office. These versions get features much later than the "Current Channel."
Solution: Talk to your IT department about switching your update channel or use a third party converter in the meantime.

When Power Query Isn't Enough

Power Query is an incredible tool, but it's a general-purpose data transformation engine that happens to live inside Excel. It isn't a dedicated AI document processor. There are three specific scenarios where you'll hit a wall.

Scanned Documents and Images

If you're dealing with a PDF that was created by someone taking a photo of a piece of paper, Power Query will likely show you a blank screen or a single column of gibberish. It has no optical character recognition (OCR), so it cannot read text that exists only as pixels in an image.

Complex Nested Tables

If your PDF has tables inside tables, or a "dashboard" style layout with charts and text boxes scattered everywhere, Power Query will get confused. It looks for a consistent grid. If there's no grid, there's no import.

High Volume Automation

Power Query is great for one or two files a day. But what if you have 500 invoices arriving in an email inbox every morning? Opening Excel and clicking "Refresh" 500 times isn't a viable strategy. In those cases, you need a server side solution or an API that can handle the heavy lifting without human intervention.

If you find yourself spending more than thirty minutes cleaning up a Power Query import, it's a sign that the document is too complex for Excel's native tools. You might save more time by using a professional service.

FAQ: Frequently Asked Questions about PDF to Excel

Can Power Query extract data from password protected PDFs?

No, Power Query cannot bypass PDF security. You'll need to remove the password from the document using a PDF editor like Adobe Acrobat before you can connect to it via Excel.

Does Power Query work with scanned PDFs?

Only if the scan has already been processed with OCR. If you can't highlight the text with your mouse inside the PDF, Power Query won't be able to see it. It only reads the text layer, not the image layer.

How do I refresh the data if the PDF changes?

This is the best part. You don't have to repeat any steps. Save the new PDF over the old one (same folder, same file name), then go to the Data tab in Excel and click Refresh All. Excel will read the new PDF, apply all your transformations, and update the table.

Why does my data look different in the preview than in the final sheet?

The preview in the Navigator is just a "glimpse." Sometimes Excel simplifies the preview to save memory. Always click Transform Data to see the full, accurate version of your data.

Can I import multiple PDF files at once?

Yes, but the process is slightly different. Instead of "From PDF," you should choose Get Data > From File > From Folder. Put all your PDFs in one folder. Power Query will then combine all the files in that folder into one massive table, provided they all have the same structure.
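A Python sketch of the combine step shows why identical structure matters. It assumes each file has already been read into rows with the header first; the layout check and the function name `combine_files` are additions for illustration, and the file names are made up. The extra first column mirrors the "Source.Name" column that a folder combine adds so you can tell which file each row came from.

```python
def combine_files(tables_by_file):
    """Stack one table per file, tagging each row with its file name.

    tables_by_file maps a file name to a list of rows whose first row is the header.
    """
    combined = []
    expected = None
    for name in sorted(tables_by_file):
        header, *body = tables_by_file[name]
        if expected is None:
            expected = header
        elif header != expected:
            # Illustrative safeguard: refuse to mix files with different layouts.
            raise ValueError(f"{name} has a different layout: {header}")
        combined.extend([name, *row] for row in body)
    return ["Source.Name", *(expected or [])], combined


# Hypothetical folder contents, already extracted into rows.
folder = {
    "jan.pdf": [["Date", "Amount"], ["2024-01-05", "4.50"]],
    "feb.pdf": [["Date", "Amount"], ["2024-02-03", "12.00"]],
}

header, rows = combine_files(folder)
print(header)
print(rows)
```

Files are processed in alphabetical order here, so "feb.pdf" comes before "jan.pdf"; sort on the date column afterwards if order matters.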

Summary of the Power Query Workflow

  1. Connect: Data > Get Data > From File > From PDF.
  2. Select: Pick your file and choose the Table or Page object.
  3. Transform: Click Transform Data to enter the editor.
  4. Clean: Promote headers, fill nulls, and change data types.
  5. Load: Click Close & Load to bring the data into your spreadsheet.
  6. Refresh: Right click the table and choose Refresh anytime to update it with new file data.

Expert Help for Complex Data Extraction

While Power Query is a fantastic tool for standard reports, many business documents are simply too messy for Excel to handle on its own. If you're struggling with thousands of pages, poor quality scans, or highly complex layouts that keep breaking your Power Query steps, we can help.

At DataConvertPro, we specialize in high accuracy data extraction that goes far beyond what a standard spreadsheet tool can do. We use advanced AI and custom parsing logic to ensure your data is perfect every time.

Stop fighting with messy PDFs. Get a custom quote today and let our experts handle the conversion for you. We'll deliver clean, structured Excel files so you can get back to the work that actually matters.
