VBA get data from website

VBA is a very useful helper in the Microsoft Office world to automate everyday recurring processes. Such a process can be that you need to read information from a certain web page and transfer it to Excel, Word or another file, for example. Of course, this can also be done by copy and paste. But it would be much better if you could read the source code of a web page with one click and then process it further.

Fetching HTML code from web page

The following image shows an Excel spreadsheet as an example, in which an Internet address/URL is stored in the first cell. In addition, a “Read web page” button has been added to the Excel table. This is done via the “Developer tools” ➥ “Insert” menu.

Now, when you click on the button, the web page/URL that is entered in the first cell is to be read out for further use.

Screenshot VBA-Editor Sub to read a website
VBA code for reading out a web page (the source code/HTML code is read out)

By right-clicking on the button and selecting “Assign Macro…”, you then assign the VBA function to be executed to the button. In this case the function is called “BtnReadWebsite_Click”.

Screenshot Excel assign macro
right click and assign macro

In the dialog that opens, you can select the SUB-method that reads the HTML code of the web page.

Screenshot Excel Makro zuweisen zum Auslesen des Quelltexts einer Webseite
Makro zuweisen zum Auslesen des Quelltexts einer Webseite

VBA code for scraping URL

To read the content of a web page, use the following VBA code in the VBA editor:

Sub BtnReadWebsite_Click()
    Dim url As String
    url = Cells(1, 1).Value
    URL_Load url
End Sub

Private Sub URL_Load(ByVal sURL As String)
    Dim appInternetExplorer As Object
    Dim htmlTxt As String
    Set appInternetExplorer = CreateObject("InternetExplorer.Application")
    appInternetExplorer.navigate sURL
    Do: Loop Until appInternetExplorer.Busy = False
    Do: Loop Until appInternetExplorer.Busy = False
    htmlTxt = appInternetExplorer.document.DocumentElement.outerHTML
    Debug.Print htmlTxt
    Set appInternetExplorer = Nothing
    'Do something here with html source code: parse, output, save,...
    MsgBox "The text was read out!!"
End Sub

In the first Sub-routine the internet address is read from the Excel cell and passed to the URL_Load function.

The URL_Load function then creates an Internet Explorer instance that loads the web page. The “Loop” functions are used to bridge the time needed for loading the web page.

Via appInternetExplorer.document.DocumentElement.outerHTML the assignment of the read HTML code takes place.

Now you can use the string variable “htmlText” to process the text, e.g on position of the comment. In the example here the text is printed via Debug.Print in the direct window of the VBA editor. But you can also add the text to the clipboard or paste it into a concrete Excel cell.

In most cases it will be necessary to parse the source code of the web page, because you need only a specific information from the web page.

Bestseller No. 1
Bestseller No. 2

Leave a Reply

Your email address will not be published.