Incorporating HTML into wxPython: Part 1

wxPy in action

Most GUI programming toolkits are large and complex, which makes them hard to use. wxPython is an open source, cross-platform GUI toolkit for the Python programming language. It has gained popularity quickly because it offers productivity gains and useful features, and because it lets programmers create programs with a robust, highly functional graphical user interface simply and easily.

Manning recently released the book “wxPython in Action”. It is the only published source for the wxPython toolkit, and it shows why wxPython is a better interface tool than Tkinter and other GUI toolkits. The following excerpt from the book covers HTML with wxPython, the HTML parser, tags and file formats, and using widgets in HTML.


Adapted from: wxPython in Action.
Written by Noel Rappin and Robin Dunn.
Released on: 23 March, 2006
ISBN: 1-932394-62-1
Reprinted by permission of Manning Publications Co.

Incorporating HTML into your application

This chapter covers

  • Displaying HTML in a wxPython window
  • Manipulating and printing HTML windows
  • Using the HTML parser
  • Supporting new tags and other file formats
  • Using widgets in HTML

Originally intended as a simple semantic markup for a hypertext system used by physicists, HTML has since become more complex and widespread. Over time, HTML’s document markup has proven useful outside of a web browser, and is now often used as a commonly understood minilanguage for general text markup (as in a text control), or to manage a series of hyperlinked pages (as in a help system). In wxPython, there are a number of features dedicated to managing your HTML needs within your application. You can display simple HTML in a window, follow hyperlinks, create your own HTML help pages, and even embed a more fully featured browser if you need more complexity.


16.1 Displaying HTML

The most important thing you can do with HTML in wxPython is display it in a window. Over the next two sections we’ll discuss the HTML Window object and show how you can use it on your own local text or on a remote URL.

16.1.1 How can I display HTML in a wxPython window?

HTML within wxPython is a useful mechanism for quickly describing a text layout involving styled text or a simple grid, as we discussed in chapter 6. The wxPython wx.html.HtmlWindow class is used for this purpose. Its goal is to display HTML, making it a fancy static text control with hypertext links. Figure 16.1 displays a modest example.


Listing 16.1 displays the code used to create figure 16.1.

Listing 16.1  Displaying the simple HtmlWindow

import wx
import wx.html

class MyHtmlFrame(wx.Frame):
    def __init__(self, parent, title):
        wx.Frame.__init__(self, parent, -1, title)
        html = wx.html.HtmlWindow(self)
        if "gtk2" in wx.PlatformInfo:
            html.SetStandardFonts()

        html.SetPage(
            "Here is some <b>formatted</b> <i><u>text</u></i> "
            "loaded from a <font color=\"red\">string</font>.")

app = wx.PySimpleApp()
frm = MyHtmlFrame(None, "Simple HTML")
frm.Show()
app.MainLoop()

As you can see, the wx.html.HtmlWindow is declared and used the same way as every other wxPython widget. However, you must import the wx.html module, as wx.html.HtmlWindow is declared in a submodule of the wx package along with several HTML helper classes. The constructor is nearly identical to that of wx.ScrolledWindow.

wx.html.HtmlWindow(parent, id=-1, pos=wx.DefaultPosition,
        size=wx.DefaultSize, style=wx.html.HW_SCROLLBAR_AUTO,
        name="htmlWindow")


All of these parameters should look familiar by now. The most important difference is that the default style is wx.html.HW_SCROLLBAR_AUTO, which tells the HTML window to automatically add scrollbars as needed. The opposite style, which never displays scrollbars, uses the style flag wx.html.HW_SCROLLBAR_NEVER. One more HTML window style is wx.html.HW_NO_SELECTION, which prevents the user from making a text selection in the window.

When writing the HTML for display in the HTML window, remember to keep it simple. The widget is designed for simple styled text display, not for use as a full multimedia hypertext system. Most basic text tags are supported, but more advanced features like cascading style sheets and JavaScript are not. Highly complex tables and image setups may work, but you’re setting yourself up for a fall. Table 16.1 contains the officially supported HTML tags. In general, tags and attributes behave as they would in a web browser. However, this is not a full-fledged browser, and there are likely to be cases that behave oddly. To avoid confusion, note that table 16.1 is not written in proper HTML syntax: each entry is just the tag name followed by the list of supported attributes for that tag, if any. When using the HTML window you’ll need to use proper HTML syntax.

Table 16.1 Valid HTML tags for the HTML window widget

Category                  Valid tags

Document Structure Tags   <a href name target>
                          <body alignment bgcolor link text>
                          <meta content http-equiv>
                          <title>

Text Structure Tags       <br>
                          <div align>
                          <hr align noshade size width>
                          <p>

Text Display Tags         <address>
                          <b> <big> <blockquote>
                          <center> <cite> <code>
                          <em>
                          <font color face size>
                          <h1> <h2> <h3> <h4> <h5> <h6>
                          <i> <kbd> <pre>
                          <samp> <small> <strike> <strong>
                          <tt>
                          <u>

List Tags                 <dd> <dl> <dt> <li> <ol> <ul>

Image and Map Tags        <area coords href shape>
                          <img align height src width usemap>
                          <map name>

Table Tags                <table align bgcolor border cellpadding cellspacing valign width>
                          <td align bgcolor colspan rowspan valign width nowrap>
                          <th align bgcolor colspan valign width rowspan>
                          <tr align bgcolor valign>

The HTML window uses wx.Image to load and display images, so it can support all the image file formats that wx.Image does.


16.1.2 How can I display HTML from a file or URL?

Once you have an HTML window created, the next challenge is to display the HTML text in the window. The following four methods are used to get HTML text into the window.

  • SetPage(source)
  • AppendToPage(source)
  • LoadFile(filename)
  • LoadPage(location)

The most direct is the method SetPage(source), where the source parameter is a string containing the HTML source that you want displayed in the window.

Once you have text in the page, you can append HTML to the end of text that is currently in the window with the method AppendToPage(source). For both the SetPage() and AppendToPage() methods, the code assumes that the source is HTML, meaning that if you pass it plain text, the spacing is ignored in keeping with the HTML standard.

If you want your window to behave more like a browser by actually browsing external resources, you have two options. The method LoadFile(filename) reads the contents of a local file and displays them in the window. In this case, the window takes advantage of the MIME type of the file to load an image file or an HTML file. If it can’t decide which type the file is, it loads the file as plain text. If the document that is loaded contains relative links to images or other documents, the base location used to resolve those links is the location of the original file.

Of course, a real browser isn’t limited to mere local files. You can load a remote URL with the method LoadPage(location), where the location is typically a URL, but could also be a pathname to a local file. The MIME type of the URL is used to determine how the page is loaded. Later in this chapter, we’ll describe how to add support for new file types.
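As a rough illustration of how the four methods divide the work, the sketch below picks one of them based on what the caller passes in. The helper names pick_loader and show_source are invented for this example, and the rules used to tell markup from a location (a leading "<", or "://" somewhere in the text) are assumptions that only suit simple cases. The html argument is expected to be a wx.html.HtmlWindow, although any object with the same four methods will do.

```python
import os


def pick_loader(source):
    """Return the name of the HtmlWindow method suited to `source`.

    Hypothetical helper: the rules below are simple guesses made for this
    example, not something wxPython does for you.
    """
    text = source.lstrip()
    if text.startswith("<"):
        return "SetPage"       # looks like literal HTML markup
    if "://" in text:
        return "LoadPage"      # looks like a URL
    if os.path.isfile(source):
        return "LoadFile"      # an existing local file
    return "SetPage"           # fall back to treating it as HTML text


def show_source(html, source, append=False):
    """Display `source` in `html`, a wx.html.HtmlWindow (or a look-alike)."""
    name = pick_loader(source)
    if name == "SetPage" and append:
        name = "AppendToPage"  # add to the current page instead of replacing it
    getattr(html, name)(source)
    return name
```

Inside a frame, a call such as show_source(html, "<p>One more paragraph</p>", append=True) would add to whatever the window already shows, while passing a URL would replace the page.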

Figure 16.2 displays a page loaded into an HTML window.


Listing 16.2 displays the code used to display figure 16.2.

Listing 16.2  Loading the HTML window content from a web page

import wx
import wx.html

class MyHtmlFrame(wx.Frame):
    def __init__(self, parent, title):
        wx.Frame.__init__(self, parent, -1, title, size=(600,400))
        html = wx.html.HtmlWindow(self)
        if "gtk2" in wx.PlatformInfo:
            html.SetStandardFonts()

        html.LoadPage(
            "http://wxwidgets.org/manuals/2.5.4/wx_wxbutton.html")

app = wx.PySimpleApp()
frm = MyHtmlFrame(None, "Simple HTML Browser")
frm.Show()
app.MainLoop()

The key point in listing 16.2 is the LoadPage() method. A more full-featured window would probably display the URL in a text box, and change the window contents when the user enters a new URL.

16.2 Manipulating the HTML window

Once you have an HTML window, you can manage it in various ways. You can trigger actions based on user input, manipulate the contents of the window, automatically bind the containing frame to display information about the window, and print the page. In the following sections, we’ll describe how to accomplish each of these.


16.2.1 How can I respond to a user click on an active link?

The use of a wx.html.HtmlWindow is not limited to display. You can also respond to user input. In this case, you do not need to define your own event handlers, as the C++ code comes with a set of predefined handlers that you can override in your own subclass of wx.html.HtmlWindow. This is the sort of thing you’d do to make your HTML window behave like an actual browser, including following hyperlinks, and displaying tooltips when the user hovers over a link.

Table 16.2 describes the defined handlers. The wx.html.HtmlWindow class does not define these events using the normal event system, so you must handle them by overriding these methods in a subclass, rather than binding them as event types.

Table 16.2 Event handlers of wx.html.HtmlWindow

Method

Description

OnCellClicked(cell, x, y, event)

Called when the user clicks inside the HTML document. The cell argument is a wx.html.HtmlCell object representing a portion of the displayed document, usually something like a run of same-styled text, a table cell, or an image. The wx.html.HtmlCell class is created by the HTML parser, and will be discussed later in the chapter. The x and y coordinates are the exact pixel location of the mouse click, and the event is the relevant mouse click event. The default version of this method simply delegates to OnLinkClicked() if the cell contains a link, otherwise it does nothing.

OnCellMouseHover(cell, x, y)

Called when the user rolls the mouse over an HTML cell, where a cell has the same definition as above. The arguments are as in OnCellClicked().

OnLinkClicked(link)

Called when the user clicks on a hyperlink. The link argument is of the parser-created class wx.html.HtmlLinkInfo, and contains the information needed to load the linked resource. The default version of the method calls LoadPage on the URL in the link. A common use case for overriding this method is to use an HtmlWindow to make a fancy about box for an application. In that case you might change the behavior so that clicking the homepage link launches the system’s default browser using Python’s webbrowser module.

OnOpeningURL(type, url)

Called when the user requests a URL to open, whether it is a page or an image that is part of a page. The type argument is wx.html.HTML_URL_PAGE, wx.html.HTML_URL_IMAGE, or wx.html.HTML_URL_OTHER. This method returns one of the following: wx.html.HTML_OPEN to allow the resource to load, wx.html.HTML_BLOCK to prevent the resource from loading, or a string that is used as a URL redirect, in which case the method is called again on the redirected location. The default version of this method always returns wx.html.HTML_OPEN.

OnSetTitle(title)

Called when the HTML source has a <title> tag. Generally used to display that title elsewhere in the application.

Again, if you want an HTML window that responds to user input, you must create your own subclass and override these methods.
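The about-box case mentioned in table 16.2 might look something like the following sketch. The class name AboutHtmlWindow and the helper is_external are invented for this example, and the rule for what counts as an external link is an assumption. The direct call to the base class method is also an assumption about your wxPython version, since some older releases expose the default handler differently. The import is guarded so that the helper can be tried out even where wxPython is not installed.

```python
import webbrowser

try:
    import wx
    import wx.html
except ImportError:   # wxPython is not installed; only the helper below works
    wx = None


def is_external(href):
    """Hypothetical rule: treat http(s) and mailto links as external."""
    return href.lower().startswith(("http://", "https://", "mailto:"))


if wx is not None:

    class AboutHtmlWindow(wx.html.HtmlWindow):
        """HTML window for an about box that hands web links to the browser."""

        def OnLinkClicked(self, link):
            href = link.GetHref()
            if is_external(href):
                # Open the page in the system's default browser.
                webbrowser.open(href)
            else:
                # Keep the default behavior for everything else.
                wx.html.HtmlWindow.OnLinkClicked(self, link)
```

You would then create an AboutHtmlWindow wherever the earlier listings create a plain wx.html.HtmlWindow. Links to local pages still load inside the window, while web addresses open in the user's browser.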


16.2.2 How can I change an HTML window programmatically?

If you are displaying an HTML page, there’s a good chance that your frame is behaving like a browser in one way or another. Even if it’s not actually browsing the web, it could be browsing help files or other kinds of linked data. As the user browses, other parts of your display may also need to change in response to the user’s actions.

There are a couple of ways to access and change information in the HTML window while it’s running. First, you can get the URL of the currently opened page with the method GetOpenedPage(). This method only works if the current page was loaded using the LoadPage() method. If so, the return value is the URL of the current location (as a string). If not, or if there is no currently open page, the method returns an empty string. There’s a related method, GetOpenedAnchor(), that returns the anchor within the currently opened page. If the page was not opened with LoadPage(), you get an empty string.

To get the HTML title of the current page, use the method GetOpenedPageTitle(), returning whatever value is contained in the current page’s <title> tag. If the current page doesn’t have a <title> tag, you get an empty string.
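The three getters can be combined into a one-line summary of the current location, for instance to show in a log or a status area. This is only a sketch: describe_location is an invented helper name, the output format is arbitrary, and the code assumes the anchor comes back without a leading "#".

```python
def describe_location(html):
    """Summarize what `html`, a wx.html.HtmlWindow, is currently showing."""
    page = html.GetOpenedPage()
    if not page:
        # Empty string: nothing is open, or the text came from SetPage().
        return "(no page loaded with LoadPage)"
    anchor = html.GetOpenedAnchor()
    if anchor:
        # Assumption: the anchor name is returned without a leading "#".
        page = "%s#%s" % (page, anchor)
    title = html.GetOpenedPageTitle() or "untitled"
    return "%s [%s]" % (title, page)
```

Because all three methods return an empty string when they have nothing to report, the helper can test the results directly instead of catching exceptions.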

There are a few methods for changing the text selection within the window. The method SelectAll() changes the text selection to the entire body text of the opened page. You can make a more specific selection with SelectLine(pos) or SelectWord(pos). In both cases, the pos argument is the wx.Point of the mouse position, and selects either the entire line or just the word at that point. To extract the current selection as plain text you can use the SelectionToText() method, while the method ToText() returns the entire document as plain text.

The wx.html.HtmlWindow maintains a history list of the source pages loaded into it. Using the methods listed in table 16.3, that history list can be navigated as in a typical browser.

Table 16.3 History methods of wx.html.HtmlWindow

Method

Description

HistoryBack()

Loads the previous entry in the history list. Returns False if there is no such entry.

HistoryCanBack()

Returns True if there is a previous entry in the history list, False otherwise.

HistoryCanForward()

Returns True if there is a next entry in the history list, False otherwise.

HistoryClear()

Empties the history list.

HistoryForward()

Loads the next entry in the list. Returns False if there is no such entry.
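A browser-style toolbar could drive these methods roughly as follows. The function names and the two button arguments are hypothetical; the buttons are assumed to be ordinary wxPython controls with an Enable() method, and html is the wx.html.HtmlWindow being navigated.

```python
def go_back(html):
    """Load the previous history entry, if any; return True on success."""
    if html.HistoryCanBack():
        return html.HistoryBack()
    return False


def go_forward(html):
    """Load the next history entry, if any; return True on success."""
    if html.HistoryCanForward():
        return html.HistoryForward()
    return False


def update_buttons(html, back_button, forward_button):
    """Enable each (hypothetical) toolbar button only when it can do something."""
    back_button.Enable(html.HistoryCanBack())
    forward_button.Enable(html.HistoryCanForward())
```

Calling update_buttons() after every page load or history move keeps the Back and Forward buttons in step with the history list, so the user never clicks a button that has nowhere to go.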

To change the fonts being used, use the method SetFonts(normal_face, fixed_face, sizes=None). The normal_face argument is the string name of the font you want to use for the proportional font in the window display. If normal_face is an empty string, the system default is used. Otherwise, the exact font names available depend on the operating system. The fixed_face argument works similarly, and specifies the font used for monospaced text in your browser (for example, within <pre> tags). If specified, the sizes element is a Python list of seven integers representing the absolute font sizes that correspond to the HTML logical font sizes between -2 and +4 (as used in a <font> tag). If the argument is None or not specified, defaults are used. The defaults are available as the constants wx.html.HTML_FONT_SIZE_n, where n is between 1 and 7. These constants specify the default font size used for the corresponding HTML logical font size. The exact values of the constants may differ depending on the underlying system. To select a set of fonts and sizes that are based on the user’s system preferences (rather than the hard-coded defaults), call SetStandardFonts(). This is especially useful when running wxPython under GTK2, as it produces a better set of fonts.
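To make the sizes argument concrete, here is a small sketch that builds the seven-element list from a base point size. The helpers scaled_sizes and apply_fonts are invented for this example, and the evenly spaced scale is just one possible choice rather than what wxPython uses by default.

```python
def scaled_sizes(base=10, step=2):
    """Build seven point sizes for HTML logical sizes -2 through +4.

    Hypothetical helper: `base` is the size used for logical size 0, and
    each step up or down changes the size by `step` points (minimum 1).
    """
    return [max(base + step * offset, 1) for offset in range(-2, 5)]


def apply_fonts(html, normal_face="", fixed_face="", base=10):
    """Call SetFonts() on `html`, a wx.html.HtmlWindow (or a look-alike)."""
    sizes = scaled_sizes(base)
    # Empty face names ask for the system default fonts.
    html.SetFonts(normal_face, fixed_face, sizes)
    return sizes
```

With the defaults, apply_fonts(html) passes the list [6, 8, 10, 12, 14, 16, 18], so text at logical size 0 is drawn at 10 points and each <font size> step changes it by 2 points.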

If for some reason you need to change the distance between the edge of the window text and the edge of the window, the HTML window defines the method SetBorders(b). The b argument is the integer pixel width between the edge of the window and the beginning of the text.

16.2.3 How can I display the page title in a frame’s title bar?

One thing you’ve probably noticed in your web browser is that the display window is not the only element of the browser. Among the other elements of note are a title bar and status bar in the containing frame. Typically, the title bar displays the title of the HTML page being displayed, and the status bar displays information about links as they are moused over. In wxPython, there are a couple of shortcuts that allow you to set this relationship up quickly and easily. Figure 16.3 displays this relationship in action using a page from the wxWidgets online documentation. The title of the window display is based on the web page title, and the status bar text also comes from the HTML window.



Listing 16.3 displays the code used to produce figure 16.3.

Listing 16.3  Loading the HTMLWindow content from a web page

import wx
import wx.html

class MyHtmlFrame(wx.Frame):
    def __init__(self, parent, title):
        wx.Frame.__init__(self, parent, -1, title, size=(600,400))
        self.CreateStatusBar()

        html = wx.html.HtmlWindow(self)
        if "gtk2" in wx.PlatformInfo:
            html.SetStandardFonts()
        html.SetRelatedFrame(self, self.GetTitle() + " -- %s")
        html.SetRelatedStatusBar(0)

        html.LoadPage(
            "http://wxwidgets.org/manuals/2.5.4/wx_wxbutton.html")

app = wx.PySimpleApp()
frm = MyHtmlFrame(None, "Simple HTML Browser")
frm.Show()
app.MainLoop()

To set up the title bar relationship, use the method SetRelatedFrame(frame, format). The frame argument is the wx.Frame where you want the HTML window’s title information to display. The format argument is the string you want to display in the title bar of that frame. It can be any string you want, as long as it contains the pattern “%s” somewhere—that pattern is replaced by the <title> of the HTML page being displayed in the HTML window. A typical format argument would be something like “My wxPython Browser: %s”. When a page is loaded in the window, the frame title is automatically replaced with the new page information.

To set up the status bar, use the method SetRelatedStatusBar(bar). This method must be called after SetRelatedFrame(), as it associates the HTML window with an element in the status bar of the related frame. The bar argument is the display slot in the status bar that should be used to display the status information. Typically, this will be 0, but it could be different if the frame has its status bar configured to display multiple slots. If the bar argument is -1, no messages are displayed. Once this relationship is created, when the mouse moves over an active anchor in the HTML display, the URL being linked to is displayed in the status bar.


16.2.4 How can I print an HTML page?

Once the HTML is displayed on the screen, the next logical thing to do is print the HTML. The class for this is wx.html.HtmlEasyPrinting. You create an instance of wx.html.HtmlEasyPrinting with a simple constructor:

wx.html.HtmlEasyPrinting(name="Printing", parentWindow=None)

The parameters aren’t usually that important—the name parameter is just a string that is used for display in the various dialogs that the easy printing instance creates. If defined, the parentWindow is the parent window of those dialogs. If the parentWindow is None, the dialogs display at the top level. You should only create one instance of wx.html.HtmlEasyPrinting. Although this is not enforced by the wxPython system, the class is designed to be a singleton.

Using the class instance

With a name like wx.html.HtmlEasyPrinting, you’d expect that using the class would be easy. And it is. To start, you can show the user dialog boxes for settings with the methods PrinterSetup() and PageSetup(). Calling these methods causes the appropriate dialog to be displayed to the user. The easy printing instance stores these settings for later use, so that you don’t have to. If you want to access those data objects for your own specific handling, use the methods GetPrintData() and GetPageSetupData(). The GetPrintData() method returns a wx.PrintData object, and GetPageSetupData() returns a wx.PageSetupDialogData object, both of which are discussed in more detail in chapter 17.

Setting fonts

You can set the fonts for the printing using the method SetFonts(normal_face, fixed_face, sizes). This method behaves in the same way SetFonts() does for the HTML window (the settings in the print object do not affect the settings in the HTML window). You can set a page header or page footer to be printed on each page using the methods SetHeader(header, pg) and SetFooter(footer, pg). The header and footer arguments are the strings to be displayed. In the string, you can use the placeholder @PAGENUM@ which is replaced at runtime with the page number being printed. You can also use the placeholder @PAGESCNT@ which is the total number of pages being printed. You can use either placeholder as many times as you want. The pg parameter is one of the three constants wx.PAGE_ALL, wx.PAGE_EVEN, or wx.PAGE_ODD. The constant controls on which pages the header or footer displays. By calling this method more than once with different pg settings, you can set separate headers and footers for odd and even pages.
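The following sketch shows how the two placeholders might be used together. The function names are invented for this example: expand() merely imitates the substitution that happens at print time so you can see the result, and add_page_numbers() takes the page-selection constant as an argument rather than naming it, so the caller decides which pages are affected. The printer argument is assumed to be a wx.html.HtmlEasyPrinting instance.

```python
def expand(template, page, total):
    """Imitate the print-time placeholder substitution (illustration only)."""
    return (template.replace("@PAGENUM@", str(page))
                    .replace("@PAGESCNT@", str(total)))


def add_page_numbers(printer, title, pg):
    """Set a header and footer on `printer`, a wx.html.HtmlEasyPrinting.

    `pg` is the page-selection constant described above (all, even, or odd).
    """
    header = title
    footer = "Page @PAGENUM@ of @PAGESCNT@"
    printer.SetHeader(header, pg)
    printer.SetFooter(footer, pg)
    return header, footer
```

On the second page of a five-page job, the footer set here would read "Page 2 of 5", which is also what expand("Page @PAGENUM@ of @PAGESCNT@", 2, 5) returns.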

Previewing output

If you want to preview the output before printing, you can use PreviewFile(htmlfile). In this case the input is the name of a file local to your machine containing the HTML. Alternatively, you can use the method PreviewText(htmlText, basepath=""). The htmlText argument is the actual HTML you want printed. The basepath is a path or URL to the location of the file, which is used to resolve things like relative image paths. Both methods return True if the preview is successful, and False if not. If there is an error, the static method wx.Printer.GetLastError() will have more information about the error. That method is discussed in more detail in chapter 17.

Printing

Having read this far in the section about the easy printing method, you are probably wondering how to simply print an HTML page. The methods are PrintFile(htmlfile) and PrintText(htmlText, basepath). The arguments behave the same way as they do in the preview methods, with the exception that these methods actually print to the printer using the settings specified in the settings dialog. A True result indicates that, as far as wxPython was concerned, the printing was successful.


16.3 Extending the HTML window

In this section we’ll show you how to handle obscure HTML tags in the HTML window, how to invent your own tags, how to embed wxPython widgets in the HTML, how to handle other file formats, and how to create a real HTML browser in your application.

16.3.1 How does the HTML parser work?

The HTML window has its own internal parser within wxPython. Actually, there are two parser classes, but one of them is a refinement of the other. In general, working with parsers is only useful if you want to extend the functionality of the wx.html.HtmlWindow itself. If you are programming in Python and want to use an HTML parser for other purposes, we recommend using one of the two parser modules that are distributed with Python (htmllib and HTMLParser), or an external Python tool like Beautiful Soup. We’re only going to cover this enough to give you the basics needed to add your own tag type.

The two parser classes are wx.html.HtmlParser, which is the more generic parser, and wx.html.HtmlWinParser, which is a subclass of wx.html.HtmlParser, with extensions specifically created to support displaying text in a wx.html.HtmlWindow. Since we’re mostly concerned with HTML windows here, we’ll focus on the subclass.

To create an HTML parser, use one of two constructors. The basic one, wx.html.HtmlWinParser(), takes no arguments. The parent wx.html.HtmlParser class also has a no-argument constructor. You can associate a wx.html.HtmlWinParser with an existing wx.html.HtmlWindow using the other constructor, wx.html.HtmlWinParser(wnd), where wnd is the instance of the HTML window.

The simplest way to use the parser is to call the method Parse(source). The source parameter is the HTML string to be processed. The return value is the parsed data. For a wx.html.HtmlWinParser, the return value is an instance of the class wx.html.HtmlCell.

The HTML parser converts the HTML text into a series of cells, where a cell is some meaningful fragment of the HTML. A cell can represent some text, an image, a table, a list, or any other specific element. The most significant subclass of wx.html.HtmlCell is wx.html.HtmlContainerCell, which is simply a cell that can contain other cells within it, such as a table, or a paragraph with different text styles. For nearly every document that you parse, the return value will be a wx.html.HtmlContainerCell.

Each cell contains a Draw(dc, x, y, view_y1, view_y2) method, which allows it to actually draw its information in the HTML window.

Another important cell subclass is wx.html.HtmlWidgetCell, which allows an arbitrary wxPython widget to be inserted into an HTML document just like any other cell. This can include any kind of widget used to manage HTML forms, but can also include static text used for formatted display. The only interesting method of wx.html.HtmlWidgetCell is the constructor.

wx.html.HtmlWidgetCell(wnd, w=0)

In the constructor, the wnd parameter is the wxPython widget to be drawn. The w parameter is a floating width. If it is not 0, it must be an integer between 1 and 100, and the width of the wnd widget is then dynamically adjusted to be that percentage of the width of its parent container.

There are many other cell types that are used to display the more typical parts of an HTML document. For more information regarding these other cell types, refer to the wxWidgets documentation.

If you liked this excerpt from the book “wxPython in Action”, do check out the book.
