No Fancy Terms: How shall we represent a web page?

When we start automation, how are we going to organize our code? How are we going to divide work among the individual team members?

A typical web app contains a lot of pages. So let’s start with this: how do we represent a web page?

What elements do we have on this page?
What actions can users take on this page?

How shall automation answer these two questions? What do we use to represent elements? Class variables. What do we use to define the actions that users can take on the page? Methods.

What about page navigation, i.e. how does the user come to this page? Yes, we need a method such as comeToThisPage() for navigation. To summarize:

  • We use class variables to represent web elements on the page
  • We use methods to abstract the actions that users can take on the page
  • We use methods to define Page Navigation
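The three points above can be sketched roughly like this. Everything named here is made up for illustration: SearchPage, its locators, the example.test URL, and FakeBrowser, which is only an in-memory stand-in for whatever real browser driver your project uses, so that the sketch runs without any library.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a real browser driver. It only remembers what
// was done to it, so the sketch runs without any external library.
class FakeBrowser {
    String currentUrl = "";
    String lastClicked = "";
    final Map<String, String> typedText = new HashMap<>();

    void open(String url) { currentUrl = url; }
    void type(String locator, String text) { typedText.put(locator, text); }
    void click(String locator) { lastClicked = locator; }
}

// One class per (logical) page. The page and its locators are assumptions.
class SearchPage {
    private static final String URL = "https://example.test/search";

    // Class variables represent the web elements on the page.
    private final String searchBox = "id=search-box";
    private final String searchButton = "id=search-button";

    private final FakeBrowser browser;

    SearchPage(FakeBrowser browser) {
        this.browser = browser;
    }

    // Navigation: how the user comes to this page.
    SearchPage comeToThisPage() {
        browser.open(URL);
        return this;
    }

    // Action: something the user can do on this page.
    void searchFor(String keyword) {
        browser.type(searchBox, keyword);
        browser.click(searchButton);
    }
}
```

A test would then read something like new SearchPage(browser).comeToThisPage().searchFor("page object"), with no locator showing up in the test itself.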

But wait, what exactly is a page?
Does it have to be a physical page? Not necessarily. If the page is really big and long, and you have to scroll up or down, left or right, feel free to break it down into several pages. We don’t want a big, long class with a few thousand lines. We want slim code.

In other words, Page here is a logical concept. It refers to a section of the page (which could of course be the whole page) where all the elements work together to provide certain features to the user.

Any relatively independent section, such as the footer, header or sidebar, can be treated as a Page, and we can write a separate class for each.

Also, when we write code to represent the page, we don’t have to cover all the elements and behaviors on that page. Instead, just cover the ones you are interested in, i.e. the ones your test cases refer to.

What about Shared UI Components?
A typical modern web page is likely to contain the following components:

  • Navigation menu
  • Sidebar
  • Main content
  • Header & footer

Most of the time, the header and footer, navigation menu and sidebar are shared among all the pages. These shared components can be treated as separate pages, and each deserves its own class.
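One possible way to wire this up is sketched below: each shared component gets its own small class, and a page class simply holds references to the ones it shows. The class names (Header, Sidebar, OrdersPage), the locators, and RecordingBrowser (an in-memory stand-in for a real driver) are all assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a browser driver; it only records what was clicked.
class RecordingBrowser {
    final List<String> clicks = new ArrayList<>();

    void click(String locator) { clicks.add(locator); }
}

// Shared component: written once, reused by every page that shows it.
class Header {
    private final String logoutLink = "id=logout";
    private final RecordingBrowser browser;

    Header(RecordingBrowser browser) { this.browser = browser; }

    void logout() { browser.click(logoutLink); }
}

// Another shared component with its own class.
class Sidebar {
    private final String settingsLink = "id=sidebar-settings";
    private final RecordingBrowser browser;

    Sidebar(RecordingBrowser browser) { this.browser = browser; }

    void openSettings() { browser.click(settingsLink); }
}

// A page class covers only its own main content and exposes the shared parts.
class OrdersPage {
    private final String refreshButton = "id=orders-refresh";
    private final RecordingBrowser browser;
    final Header header;
    final Sidebar sidebar;

    OrdersPage(RecordingBrowser browser) {
        this.browser = browser;
        this.header = new Header(browser);
        this.sidebar = new Sidebar(browser);
    }

    void refresh() { browser.click(refreshButton); }
}
```

With this layout a test can call ordersPage.sidebar.openSettings() or ordersPage.header.logout(), and the header and sidebar locators live in exactly one place.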

Use METHODs to define Page Behaviors

This is straightforward: to take this action, what info does the user need to type in? Which buttons do they need to click? And what is the expected behavior of our app after the action is taken? We define all of this in the method. Using login as an example:

public void login(String userName, String password) {
    // Type in the user name and password, then click the login button
    // Verify login is successful
}
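To make this a bit more concrete, here is a minimal sketch of how such a login method might be filled in. The locators, the welcome banner check, the "alice"/"secret" account and FakeLoginBrowser (an in-memory stand-in that pretends to be the app) are all invented for the example. How you verify a successful login depends on your own app.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a browser driver plus the app behind it.
class FakeLoginBrowser {
    private final Map<String, String> fields = new HashMap<>();
    private String welcomeBanner = "";

    void type(String locator, String text) { fields.put(locator, text); }

    // Simulates the app: only the made-up account alice/secret can log in.
    void click(String locator) {
        if ("id=login-button".equals(locator)) {
            boolean ok = "alice".equals(fields.get("id=user-name"))
                    && "secret".equals(fields.get("id=password"));
            welcomeBanner = ok ? "Welcome, alice" : "";
        }
    }

    String textOf(String locator) {
        return "id=welcome-banner".equals(locator) ? welcomeBanner : "";
    }
}

class LoginPage {
    // Elements on the page (locators are assumptions).
    private final String userNameField = "id=user-name";
    private final String passwordField = "id=password";
    private final String loginButton = "id=login-button";
    private final String welcomeBanner = "id=welcome-banner";

    private final FakeLoginBrowser browser;

    LoginPage(FakeLoginBrowser browser) { this.browser = browser; }

    public void login(String userName, String password) {
        // What the user types in and clicks.
        browser.type(userNameField, userName);
        browser.type(passwordField, password);
        browser.click(loginButton);

        // Verify login is successful: here we assume the app greets the user by name.
        String banner = browser.textOf(welcomeBanner);
        if (!banner.contains(userName)) {
            throw new IllegalStateException("Login failed for user: " + userName);
        }
    }
}
```

Throwing from inside the method is just one choice. You could also return a boolean and let the test decide what to assert.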

Should we include page flow info in the method?
Google ‘page object pattern’ and you will find many articles advocating that a page object class should implement page flow info by returning the next page object from the page object method, such as making login() return HomePage:

public HomePage login() {
    // code to simulate login here
    return new HomePage();
}

This is nice, but usually once logged in, we have an ever-present navigation menu to go anywhere from the current point. So this ‘coding the page flow info into the page object class’ doesn’t really apply most of the time. I personally prefer not to include navigation info in every single action method. Instead, I define all navigation information in a central Navigation class as follows:

public class Navigation extends BasePage {
    //… variable declaration part is intentionally skipped…

    public HomePage gotoHomePage() {
        // your implementation here..
        return new HomePage(driver);
    }

    public XxxPage gotoXxxPage() {
        // your implementation here..
        return new XxxPage(driver);
    }

    […more methods for more pages here…]
}
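For a version of the central Navigation idea that you can actually compile and play with, here is a small sketch. The contents of BasePage are a guess (it only holds the shared driver), StubDriver is an in-memory stand-in for a real browser driver, and SettingsPage and the example.test URLs are made up for illustration.

```java
// Hypothetical stand-in for a browser driver; it only remembers the current URL.
class StubDriver {
    private String currentUrl = "";

    void get(String url) { currentUrl = url; }
    String getCurrentUrl() { return currentUrl; }
}

// Assumed shape of BasePage: it just holds the driver shared by all page objects.
class BasePage {
    protected final StubDriver driver;

    BasePage(StubDriver driver) { this.driver = driver; }
}

class HomePage extends BasePage {
    HomePage(StubDriver driver) { super(driver); }

    boolean isDisplayed() { return driver.getCurrentUrl().endsWith("/home"); }
}

// Made-up second page, standing in for any other page of the app.
class SettingsPage extends BasePage {
    SettingsPage(StubDriver driver) { super(driver); }

    boolean isDisplayed() { return driver.getCurrentUrl().endsWith("/settings"); }
}

// All "how do I get to page X" knowledge lives here, not in the action methods.
class Navigation extends BasePage {
    private static final String BASE_URL = "https://example.test";

    Navigation(StubDriver driver) { super(driver); }

    HomePage gotoHomePage() {
        driver.get(BASE_URL + "/home");
        return new HomePage(driver);
    }

    SettingsPage gotoSettingsPage() {
        driver.get(BASE_URL + "/settings");
        return new SettingsPage(driver);
    }
}
```

A test can then start from any point with new Navigation(driver).gotoSettingsPage(), and the action methods on each page stay free of page flow details.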

So in summary, this is how we represent a UI page:

  • Each UI page has a class to represent it. Ladies and Gentlemen, time to get fancy: this class is called a Page Object Class
  • Each page object class defines page behaviors on this page
  • Put navigation info in a central, separate Navigation class

Put in a fancy term, this whole way of representing a UI page is called the Page fucking Object Model!