WebDriver drives a browser natively, as a user would, either locally
or on a remote machine using the Selenium server,
marks a leap forward in terms of browser automation.
Selenium WebDriver refers to both the language bindings
and the implementations of the individual browser controlling code.
This is commonly referred to as just WebDriver.
WebDriver is designed as a simple
and more concise programming interface.
WebDriver is a compact object-oriented API.
It drives the browser effectively.
1 - Getting started
If you are new to Selenium, we have a few resources that can help you get up to speed right away.
Selenium supports automation of all the major browsers in the market
through the use of WebDriver.
WebDriver is an API and protocol that defines a language-neutral interface
for controlling the behaviour of web browsers.
Each browser is backed by a specific WebDriver implementation, called a driver.
The driver is the component responsible for delegating down to the browser,
and handles communication to and from Selenium and the browser.
This separation is part of a conscious effort to have browser vendors
take responsibility for the implementation for their browsers.
Selenium makes use of these third party drivers where possible,
but also provides its own drivers maintained by the project
for the cases when this is not a reality.
The Selenium framework ties all of these pieces together
through a user-facing interface that enables the different browser backends
to be used transparently,
enabling cross-browser and cross-platform automation.
Selenium setup is quite different from the setup of other commercial tools.
Before you can start writing Selenium code, you have to
install the language bindings libraries for your language of choice, the browser you
want to use, and the driver for that browser.
Follow the links below to get up and going with Selenium WebDriver.
If you wish to start with a low-code/record and playback tool, please check
Selenium IDE
Once you get things working, if you want to scale up your tests, check out the
Selenium Grid.
1.1 - Install a Selenium library
Setting up the Selenium library for your favourite programming language.
First you need to install the Selenium bindings for your automation project.
The installation process for libraries depends on the language you choose to use.
Make sure you check the Selenium downloads page to make sure
you are using the latest version.
Further items of note for using Visual Studio Code (vscode) and C#
Install the compatible .NET SDK as per the section above.
Also install the vscode extensions (Ctrl-Shift-X) for C# and NuGet.
Follow the instruction here
to create and run the “Hello World” console project using C#.
You may also create a NUnit starter project using the command line dotnet new NUnit.
Make sure the file %appdata%\NuGet\nuget.config is configured properly as some developers reported that it will be empty due to some issues.
If nuget.config is empty, or not configured properly, then .NET builds will fail for Selenium Projects.
Add the following section to the file nuget.config if it is empty:
For more info about nuget.configclick here.
You may have to customize nuget.config to meet you needs.
Now, go back to vscode, press Ctrl-Shift-P, and type “NuGet Add Package”, and enter the required Selenium packages such as Selenium.WebDriver.
Press Enter and select the version.
Now you can use the examples in the documentation related to C# with vscode.
You can see the minimum required version of Ruby for any given Selenium version
on rubygems.org
Note: if you get an error about drivers not found, please read about troubleshooting the
driver location error
Eight Basic Components
Everything Selenium does is send the browser commands to do something or send requests for information.
Most of what you’ll do with Selenium is a combination of these basic commands:
1. Start the session
For more details on starting a session read our documentation on driver sessions
Synchronizing the code with the current state of the browser is one of the biggest challenges
with Selenium, and doing it well is an advanced topic.
Essentially you want to make sure that the element is on the page before you attempt to locate it
and the element is in an interactable state before you attempt to interact with it.
An implicit wait is rarely the best solution, but it’s the easiest to demonstrate here, so
we’ll use it as a placeholder.
If you are using Selenium for testing,
you will want to execute your Selenium code using test runner tools.
Many of the code examples in this documentation can be found in our example repositories.
There are multiple options in each language, but here is what we are using in our examples:
// Add instructions
//Addinstructions
// Add instructions
//Addinstructions
Install Mocha Test runner using below command in your terminal
Install with npm globally:
npm install -g mocha
or as a development dependency for your project:
npm install --save-dev mocha
and run your tests using below command
mocha firstScript.spec.js
// Add instructions
Next Steps
Take what you’ve learned and build out your Selenium code.
As you find more functionality that you need, read up on the rest of our
WebDriver documentation.
1.3 - Upgrade to Selenium 4
Are you still using Selenium 3? This guide will help you upgrade to the latest release!
Upgrading to Selenium 4 should be a painless process if you are using one of the officially
supported languages (Ruby, JavaScript, C#, Python, and Java). There might be some cases where
a few issues can happen, and this guide will help you to sort them out. We will go through
the steps to upgrade your project dependencies and understand the major deprecations and
changes the version upgrade brings.
These are the steps we will follow to upgrade to Selenium 4:
Preparing our test code
Upgrading dependencies
Potential errors and deprecation messages
Note: while Selenium 3.x versions were being developed, support for the W3C WebDriver standard
was implemented. Both this new protocol and the legacy JSON Wire Protocol were supported. Around
version 3.11, Selenium code became compliant with the level W3C 1 specification. The W3C compliant
code in the latest version of Selenium 3 will work as expected in Selenium 4.
Preparing our test code
Selenium 4 removes support for the legacy protocol and uses the W3C WebDriver standard by
default under the hood. For most things, this implementation will not affect end users.
The major exceptions are Capabilities and the Actions class.
Capabilities
If the test capabilities are not structured to be W3C compliant, may cause a session to not
be started. Here is the list of W3C WebDriver standard capabilities:
browserName
browserVersion (replaces version)
platformName (replaces platform)
acceptInsecureCerts
pageLoadStrategy
proxy
timeouts
unhandledPromptBehavior
An up-to-date list of standard capabilities can be found at
W3C WebDriver.
Any capability that is not contained in the list above, needs to include a vendor prefix.
This applies to browser specific capabilities as well as cloud vendor specific capabilities.
For example, if your cloud vendor uses build and name capabilities for your tests, you need
to wrap them in a cloud:options block (check with your cloud vendor for the appropriate prefix).
The utility methods to find elements in the Java bindings (FindsBy interfaces) have been removed
as they were meant for internal use only. The following code samples explain this better.
Check the subsections below to install Selenium 4 and have your project dependencies upgraded.
Java
The process of upgrading Selenium depends on which build tool is being used. We will cover the
most common ones for Java, which are Maven and
Gradle. The minimum Java version required is still 8.
Maven
Before
<dependencies><!-- more dependencies ... --><dependency><groupId>org.seleniumhq.selenium</groupId><artifactId>selenium-java</artifactId><version>3.141.59</version></dependency><!-- more dependencies ... --></dependencies>
After
<dependencies><!-- more dependencies ... --><dependency><groupId>org.seleniumhq.selenium</groupId><artifactId>selenium-java</artifactId><version>4.4.0</version></dependency><!-- more dependencies ... --></dependencies>
After making the change, you could execute mvn clean compile on the same directory where the
pom.xml file is.
Gradle
Before
plugins {
id 'java'
}
group 'org.example'
version '1.0-SNAPSHOT'
repositories {
mavenCentral()
}
dependencies {
testImplementation 'org.junit.jupiter:junit-jupiter-api:5.7.0'
testRuntimeOnly 'org.junit.jupiter:junit-jupiter-engine:5.7.0'
implementation group: 'org.seleniumhq.selenium', name: 'selenium-java', version: '3.141.59'
}
test {
useJUnitPlatform()
}
After
plugins {
id 'java'
}
group 'org.example'
version '1.0-SNAPSHOT'
repositories {
mavenCentral()
}
dependencies {
testImplementation 'org.junit.jupiter:junit-jupiter-api:5.7.0'
testRuntimeOnly 'org.junit.jupiter:junit-jupiter-engine:5.7.0'
implementation group: 'org.seleniumhq.selenium', name: 'selenium-java', version: '4.4.0'
}
test {
useJUnitPlatform()
}
After making the change, you could execute ./gradlew clean build
on the same directory where the build.gradle file is.
To check all the Java releases, you can head to MVNRepository.
C#
The place to get updates for Selenium 4 in C# is NuGet. Under the
Selenium.WebDriver package you
can get the instructions to update to the latest version. Inside of Visual Studio, through the
NuGet Package Manager you can execute:
The most important change to use Python is the minimum required version. Selenium 4 will
require a minimum Python 3.7 or higher. More details can be found at the
Python Package Index. To upgrade from the
command line, you can execute:
pip install selenium==4.4.3
Ruby
The update details for Selenium 4 can be seen at the
selenium-webdriver
gem in RubyGems. To install the latest version, you can execute:
gem install selenium-webdriver
To add it to your Gemfile:
gem 'selenium-webdriver', '~> 4.4.0'
JavaScript
The selenium-webdriver package can be found at the Node package manager,
npmjs. Selenium 4 can be found
here. To install it, you
could either execute:
Waits are also expecting different parameters now. WebDriverWait is now expecting a Duration
instead of a long for timeout in seconds and milliseconds. The withTimeout and pollingEvery
utility methods from FluentWait have switched from expecting (long time, TimeUnit unit) to
expect (Duration duration).
Merging capabilities is no longer changing the calling object
It was possible to merge a different set of capabilities into another set, and it was
mutating the calling object. Now, the result of the merge operation needs to be assigned.
Before
MutableCapabilitiescapabilities=newMutableCapabilities();capabilities.setCapability("platformVersion","Windows 10");FirefoxOptionsoptions=newFirefoxOptions();options.setHeadless(true);options.merge(capabilities);// As a result, the `options` object was getting modified.
After
MutableCapabilitiescapabilities=newMutableCapabilities();capabilities.setCapability("platformVersion","Windows 10");FirefoxOptionsoptions=newFirefoxOptions();options.setHeadless(true);options=options.merge(capabilities);// The result of the `merge` call needs to be assigned to an object.
Firefox Legacy
Before GeckoDriver was around, the Selenium project had a driver implementation to automate
Firefox (version <48). However, this implementation is not needed anymore as it does not work
in recent versions of Firefox. To avoid major issues when upgrading to Selenium 4, the setLegacy
option will be shown as deprecated. The recommendation is to stop using the old implementation
and rely only on GeckoDriver. The following code will show the setLegacy line deprecated after
upgrading.
Instead of it, AddAdditionalOption is recommended. Here is an example showing this:
Before
var browserOptions = new ChromeOptions();
browserOptions.PlatformName = "Windows 10";
browserOptions.BrowserVersion = "latest";
var cloudOptions = new Dictionary<string, object>();
browserOptions.AddAdditionalCapability("cloud:options", cloudOptions, true);
After
var browserOptions = new ChromeOptions();
browserOptions.PlatformName = "Windows 10";
browserOptions.BrowserVersion = "latest";
var cloudOptions = new Dictionary<string, object>();
browserOptions.AddAdditionalOption("cloud:options", cloudOptions);
Python
executable_path has been deprecated, please pass in a Service object
In Selenium 4, you’ll need to set the driver’s executable_path from a Service object to prevent deprecation warnings. (Or don’t set the path and instead make sure that the driver you need is on the System PATH.)
We went through the major changes to be taken into consideration when upgrading to Selenium 4.
Covering the different aspects to cover when test code is prepared for the upgrade, including
suggestions on how to prevent potential issues that can show up when using the new version of
Selenium. To finalize, we also covered a set of possible issues that you can bump into after
upgrading, and we shared potential fixes for those issues.
The primary unique argument for starting a remote driver includes information about where to execute the code.
Read the details in the Remote Driver Section
In Selenium 3, capabilities were defined in a session by using Desired Capabilities classes.
As of Selenium 4, you must use the browser options classes.
For remote driver sessions, a browser options instance is required as it determines which browser will be used.
These options are described in the w3c specification for Capabilities.
Each browser has custom options that may be defined in addition to the ones defined in the specification.
browserName
This capability is used to set the browserName for a given session.
If the specified browser is not installed at the
remote end, the session creation will fail.
browserVersion
This capability is optional, this is used to
set the available browser version at remote end.
For Example, if ask for Chrome version 75 on a system that
only has 80 installed, the session creation will fail.
pageLoadStrategy
Three types of page load strategies are available.
The page load strategy queries the
document.readyState
as described in the table below:
Strategy
Ready State
Notes
normal
complete
Used by default, waits for all resources to download
eager
interactive
DOM access is ready, but other resources like images may still be loading
none
Any
Does not block WebDriver at all
The document.readyState property of a document describes the loading state of the current document.
When navigating to a new page via URL, by default, WebDriver will hold off on completing a navigation
method (e.g., driver.navigate().get()) until the document ready state is complete. This does not
necessarily mean that the page has finished loading, especially for sites like Single Page Applications
that use JavaScript to dynamically load content after the Ready State returns complete. Note also
that this behavior does not apply to navigation that is a result of clicking an element or submitting a form.
If a page takes a long time to load as a result of downloading assets (e.g., images, css, js)
that aren’t important to the automation, you can change from the default parameter of normal to
eager or none to speed up the session. This value applies to the entire session, so make sure
that your waiting strategy is sufficient to minimize
flakiness.
normal (default)
WebDriver waits until the load
event fire is returned.
importorg.openqa.selenium.PageLoadStrategy;importorg.openqa.selenium.WebDriver;importorg.openqa.selenium.chrome.ChromeOptions;importorg.openqa.selenium.chrome.ChromeDriver;publicclasspageLoadStrategy{publicstaticvoidmain(String[]args){ChromeOptionschromeOptions=newChromeOptions();chromeOptions.setPageLoadStrategy(PageLoadStrategy.NORMAL);WebDriverdriver=newChromeDriver(chromeOptions);try{// Navigate to Url
driver.get("https://google.com");}finally{driver.quit();}}}
it('Navigate using normal page loading strategy',asyncfunction(){letdriver=awaitenv.builder().setChromeOptions(options.setPageLoadStrategy('normal')).build();awaitdriver.get('https://www.google.com');
importorg.openqa.selenium.PageLoadStrategy;importorg.openqa.selenium.WebDriver;importorg.openqa.selenium.chrome.ChromeOptions;importorg.openqa.selenium.chrome.ChromeDriver;publicclasspageLoadStrategy{publicstaticvoidmain(String[]args){ChromeOptionschromeOptions=newChromeOptions();chromeOptions.setPageLoadStrategy(PageLoadStrategy.EAGER);WebDriverdriver=newChromeDriver(chromeOptions);try{// Navigate to Url
driver.get("https://google.com");}finally{driver.quit();}}}
WebDriver only waits until the initial page is downloaded.
importorg.openqa.selenium.PageLoadStrategy;importorg.openqa.selenium.WebDriver;importorg.openqa.selenium.chrome.ChromeOptions;importorg.openqa.selenium.chrome.ChromeDriver;publicclasspageLoadStrategy{publicstaticvoidmain(String[]args){ChromeOptionschromeOptions=newChromeOptions();chromeOptions.setPageLoadStrategy(PageLoadStrategy.NONE);WebDriverdriver=newChromeDriver(chromeOptions);try{// Navigate to Url
driver.get("https://google.com");}finally{driver.quit();}}}
This identifies the operating system at the remote-end,
fetching the platformName returns the OS name.
In cloud-based providers,
setting platformName sets the OS at the remote-end.
acceptInsecureCerts
This capability checks whether an expired (or)
invalid TLS Certificate is used while navigating
during a session.
If the capability is set to false, an
insecure certificate error
will be returned as navigation encounters any domain
certificate problems. If set to true, invalid certificate will be
trusted by the browser.
All self-signed certificates will be trusted by this capability by default.
Once set, acceptInsecureCerts capability will have an
effect for the entire session.
timeouts
A WebDriver session is imposed with a certain session timeout
interval, during which the user can control the behaviour
of executing scripts or retrieving information from the browser.
Each session timeout is configured with
combination of different timeouts as described below:
Script Timeout
Specifies when to interrupt an executing script in
a current browsing context. The default timeout 30,000
is imposed when a new session is created by WebDriver.
Page Load Timeout
Specifies the time interval in which web page
needs to be loaded in a current browsing context.
The default timeout 300,000 is imposed when a
new session is created by WebDriver. If page load limits
a given/default time frame, the script will be stopped by
TimeoutException.
Implicit Wait Timeout
This specifies the time to wait for the
implicit element location strategy when
locating elements. The default timeout 0
is imposed when a new session is created by WebDriver.
unhandledPromptBehavior
Specifies the state of current session’s user prompt handler.
Defaults to dismiss and notify state
User Prompt Handler
This defines what action must take when a
user prompt encounters at the remote-end. This is defined by
unhandledPromptBehavior capability and has the following states:
This new capability indicates if strict interactability checks
should be applied to input type=file elements. As strict interactability
checks are off by default, there is a change in behaviour
when using Element Send Keys with hidden file upload controls.
proxy
A proxy server acts as an intermediary for
requests between a client and a server. In simple,
the traffic flows through the proxy server
on its way to the address you requested and back.
A proxy server for automation scripts
with Selenium could be helpful for:
Capture network traffic
Mock backend calls made by the website
Access the required website under complex network
topologies or strict corporate restrictions/policies.
If you are in a corporate environment, and a
browser fails to connect to a URL, this is most
likely because the environment needs a proxy to be accessed.
Selenium WebDriver provides a way to proxy settings:
The Service classes are for managing the starting and stopping of local drivers.
They cannot be used with a Remote WebDriver session.
Service classes allow you to specify information about the driver,
like location and which port to use.
They also let you specify what arguments get passed
to the command line. Most of the useful arguments are related to logging.
Default Service instance
To start a driver with a default service instance:
Note: If you are using Selenium 4.6 or greater, you shouldn’t need to set a driver location.
If you cannot update Selenium or have an advanced use case, here is how to specify the driver location:
Logging functionality varies between browsers. Most browsers allow you to
specify location and level of logs. Take a look at the respective browser page:
You can use WebDriver remotely the same way you would use it
locally. The primary difference is that a remote WebDriver needs to be
configured so that it can run your tests on a separate machine.
A remote WebDriver is composed of two pieces: a client and a
server. The client is your WebDriver test and the server is simply a
Java servlet, which can be hosted in any modern JEE app server.
To run a remote WebDriver client, we first need to connect to the RemoteWebDriver.
We do this by pointing the URL to the address of the server running our tests.
In order to customize our configuration, we set desired capabilities.
Below is an example of instantiating a remote WebDriver object
pointing to our remote web server, www.example.com,
running our tests on Firefox.
The Local File Detector allows the transfer of files from the client
machine to the remote server. For example, if a test needs to upload a
file to a web application, a remote WebDriver can automatically transfer
the file from the local machine to the remote web server during
runtime. This allows the file to be uploaded from the remote machine
running the test. It is not enabled by default and can be enabled in
the following way:
This feature is only available for Java client binding (Beta onwards). The Remote WebDriver client sends requests to the Selenium Grid server, which passes them to the WebDriver. Tracing should be enabled at the server and client-side to trace the HTTP requests end-to-end. Both ends should have a trace exporter setup pointing to the visualization framework.
By default, tracing is enabled for both client and server.
To set up the visualization framework Jaeger UI and Selenium Grid 4, please refer to Tracing Setup for the desired version.
For client-side setup, follow the steps below.
Add the required dependencies
Installation of external libraries for tracing exporter can be done using Maven.
Add the opentelemetry-exporter-jaeger and grpc-netty dependency in your project pom.xml:
Each browser has custom capabilities and unique features.
3.1 - Chrome specific functionality
These are capabilities and features specific to Google Chrome browsers.
By default, Selenium 4 is compatible with Chrome v75 and greater. Note that the version of
the Chrome browser and the version of chromedriver must match the major version.
Options
Capabilities common to all browsers are described on the Options page.
The args parameter is for a list of command line switches to be used when starting the browser.
There are two excellent resources for investigating these arguments:
The binary parameter takes the path of an alternate location of browser to use. With this parameter you can
use chromedriver to drive various Chromium based browsers.
Chromedriver has several default arguments it uses to start the browser.
If you do not want those arguments added, pass them into excludeSwitches.
A common example is to turn the popup blocker back on. A full list of default arguments
can be parsed from the
Chromium Source Code
Examples for creating a default Service object, and for setting driver location and port
can be found on the Driver Service page.
Log output
Getting driver logs can be helpful for debugging issues. The Service class lets you
direct where the logs will go. Logging output is ignored unless the user directs it somewhere.
File output
To change the logging output to save to a specific file:
Note: Java also allows setting console output by System Property;
Property key: ChromeDriverService.CHROME_DRIVER_LOG_PROPERTY
Property value: DriverService.LOG_STDOUT or DriverService.LOG_STDERR
There are 6 available log levels: ALL, DEBUG, INFO, WARNING, SEVERE, and OFF.
Note that --verbose is equivalent to --log-level=ALL and --silent is equivalent to --log-level=OFF,
so this example is just setting the log level generically:
Note: Java also allows setting log level by System Property:
Property key: ChromeDriverService.CHROME_DRIVER_LOG_LEVEL_PROPERTY
Property value: String representation of ChromiumDriverLogLevel enum
There are 2 features that are only available when logging to a file:
append log
readable timestamps
To use them, you need to also explicitly specify the log path and log level.
The log output will be managed by the driver, not the process, so minor differences may be seen.
Note: Java also allows toggling these features by System Property:
Property keys: ChromeDriverService.CHROME_DRIVER_APPEND_LOG_PROPERTY and ChromeDriverService.CHROME_DRIVER_READABLE_TIMESTAMP
Property value: "true" or "false"
Chromedriver and Chrome browser versions should match, and if they don’t the driver will error.
If you disable the build check, you can force the driver to be used with any version of Chrome.
Note that this is an unsupported feature, and bugs will not be investigated.
Note: Java also allows disabling build checks by System Property:
Property key: ChromeDriverService.CHROME_DRIVER_DISABLE_BUILD_CHECK
Property value: "true" or "false"
See the Chrome DevTools section for more information about using Chrome DevTools
3.2 - Edge specific functionality
These are capabilities and features specific to Microsoft Edge browsers.
Microsoft Edge is implemented with Chromium, with the earliest supported version of v79. Similar to Chrome,
the major version number of edgedriver must match the major version of the Edge browser.
Options
Capabilities common to all browsers are described on the Options page.
The args parameter is for a list of command line switches to be used when starting the browser.
There are two excellent resources for investigating these arguments:
The binary parameter takes the path of an alternate location of browser to use. With this parameter you can
use chromedriver to drive various Chromium based browsers.
Add a browser location to options:
Coding Help
Note:
This section could use some updated code examples
MSEdgedriver has several default arguments it uses to start the browser.
If you do not want those arguments added, pass them into excludeSwitches.
A common example is to turn the popup blocker back on. A full list of default arguments
can be parsed from the
Chromium Source Code
Examples for creating a default Service object, and for setting driver location and port
can be found on the Driver Service page.
Log output
Getting driver logs can be helpful for debugging issues. The Service class lets you
direct where the logs will go. Logging output is ignored unless the user directs it somewhere.
File output
To change the logging output to save to a specific file:
Note: Java also allows setting console output by System Property;
Property key: EdgeDriverService.EDGE_DRIVER_LOG_PROPERTY
Property value: DriverService.LOG_STDOUT or DriverService.LOG_STDERR
There are 6 available log levels: ALL, DEBUG, INFO, WARNING, SEVERE, and OFF.
Note that --verbose is equivalent to --log-level=ALL and --silent is equivalent to --log-level=OFF,
so this example is just setting the log level generically:
Note: Java also allows setting log level by System Property:
Property key: EdgeDriverService.EDGE_DRIVER_LOG_LEVEL_PROPERTY
Property value: String representation of ChromiumDriverLogLevel enum
There are 2 features that are only available when logging to a file:
append log
readable timestamps
To use them, you need to also explicitly specify the log path and log level.
The log output will be managed by the driver, not the process, so minor differences may be seen.
Note: Java also allows toggling these features by System Property:
Property keys: EdgeDriverService.EDGE_DRIVER_APPEND_LOG_PROPERTY and EdgeDriverService.EDGE_DRIVER_READABLE_TIMESTAMP
Property value: "true" or "false"
Edge browser and msedgedriver versions should match, and if they don’t the driver will error.
If you disable the build check, you can force the driver to be used with any version of Edge.
Note that this is an unsupported feature, and bugs will not be investigated.
Note: Java also allows disabling build checks by System Property:
Property key: EdgeDriverService.EDGE_DRIVER_DISABLE_BUILD_CHECK
Property value: "true" or "false"
Microsoft Edge can be driven in “Internet Explorer Compatibility Mode”, which uses
the Internet Explorer Driver classes in conjunction with Microsoft Edge.
Read the Internet Explorer page for more details.
Special Features
Some browsers have implemented additional features that are unique to them.
Casting
You can drive Chrome Cast devices with Edge, including sharing tabs
Coding Help
Note:
This section could use some updated code examples
The args parameter is for a list of Command line switches used when starting the browser.
Commonly used args include -headless and "-profile", "/path/to/profile"
The binary parameter takes the path of an alternate location of browser to use. For example, with this parameter you can
use geckodriver to drive Firefox Nightly instead of the production version when both are present on your computer.
Add a browser location to options:
Coding Help
Note:
This section could use some updated code examples
const{Builder}=require("selenium-webdriver");constfirefox=require('selenium-webdriver/firefox');constoptions=newfirefox.Options();letprofile='/path to custom profile';options.setProfile(profile);constdriver=newBuilder().forBrowser('firefox').setFirefoxOptions(options).build();
Service settings common to all browsers are described on the Service page.
Log output
Getting driver logs can be helpful for debugging various issues. The Service class lets you
direct where the logs will go. Logging output is ignored unless the user directs it somewhere.
File output
To change the logging output to save to a specific file:
Note: Java also allows setting console output by System Property;
Property key: GeckoDriverService.GECKO_DRIVER_LOG_PROPERTY
Property value: DriverService.LOG_STDOUT or DriverService.LOG_STDERR
Note: Java also allows setting log level by System Property:
Property key: GeckoDriverService.GECKO_DRIVER_LOG_LEVEL_PROPERTY
Property value: String representation of FirefoxDriverLogLevel enum
The driver logs everything that gets sent to it, including string representations of large binaries, so
Firefox truncates lines by default. To turn off truncation:
Note: Java also allows setting log level by System Property:
Property key: GeckoDriverService.GECKO_DRIVER_LOG_NO_TRUNCATE
Property value: "true" or "false"
The default directory for profiles is the system temporary directory. If you do not have access to that directory,
or want profiles to be created some place specific, you can change the profile root directory:
When working with an unfinished or unpublished extension, it will likely not be signed. As such, it can only
be installed as “temporary.” This can be done by passing in either a zip file or a directory, here’s an
example with a directory:
These are capabilities and features specific to Microsoft Internet Explorer browsers.
As of June 2022, Selenium officially no longer supports standalone Internet Explorer.
The Internet Explorer driver still supports running Microsoft Edge in “IE Compatibility Mode.”
Special considerations
The IE Driver is the only driver maintained by the Selenium Project directly.
While binaries for both the 32-bit and 64-bit
versions of Internet Explorer are available, there are some
known limitations
with the 64-bit driver. As such it is recommended to use the 32-bit driver.
Additional information about using Internet Explorer can be found on the
IE Driver Server page
Options
Starting a Microsoft Edge browser in Internet Explorer Compatibility mode with basic defined options looks like this:
If IE is not present on the system (default in Windows 11), you do not need to
use the two parameters above. IE Driver will use Edge and will automatically locate it.
If IE and Edge are both present on the system, you only need to set attaching to Edge,
IE Driver will automatically locate Edge on your system.
<p><ahref=https://github.com/SeleniumHQ/seleniumhq.github.io/tree/trunk/examples>
<spanclass="selenium-badge-code"data-bs-toggle="tooltip"data-bs-placement="right"title="Code examples are placed in the examples directory; see about section for contribution and style guides">MoveCode</span></a></p>valoptions=InternetExplorerOptions()valdriver=InternetExplorerDriver(options)
Here are a few common use cases with different capabilities:
fileUploadDialogTimeout
In some environments, Internet Explorer may timeout when opening the
File Upload dialog. IEDriver has a default timeout of 1000ms, but you
can increase the timeout using the fileUploadDialogTimeout capability.
When set to true, this capability clears the Cache,
Browser History and Cookies for all running instances
of InternetExplorer including those started manually
or by the driver. By default, it is set to false.
Using this capability will cause performance drop while
launching the browser, as the driver will wait until the cache
gets cleared before launching the IE browser.
This capability accepts a Boolean value as parameter.
InternetExplorer driver expects the browser zoom level to be 100%,
else the driver will throw an exception. This default behaviour
can be disabled by setting the ignoreZoomSetting to true.
This capability accepts a Boolean value as parameter.
Whether to skip the Protected Mode check while launching
a new IE session.
If not set and Protected Mode settings are not same for
all zones, an exception will be thrown by the driver.
If capability is set to true, tests may
become flaky, unresponsive, or browsers may hang.
However, this is still by far a second-best choice,
and the first choice should always be to actually
set the Protected Mode settings of each zone manually.
If a user is using this property,
only a “best effort” at support will be given.
This capability accepts a Boolean value as parameter.
<p><ahref=https://github.com/SeleniumHQ/seleniumhq.github.io/tree/trunk/examples><spanclass="selenium-badge-code"data-bs-toggle="tooltip"data-bs-placement="right"title="Code examples are added to the projects in examples directory of repo; see about section for contribution and style guides">AddExample</span></a></p>
Internet Explorer includes several command-line options
that enable you to troubleshoot and configure the browser.
The following describes few supported command-line options
-private : Used to start IE in private browsing mode. This works for IE 8 and later versions.
-k : Starts Internet Explorer in kiosk mode.
The browser opens in a maximized window that does not display the address bar, the navigation buttons, or the status bar.
-extoff : Starts IE in no add-on mode.
This option specifically used to troubleshoot problems with browser add-ons. Works in IE 7 and later versions.
Note: forceCreateProcessApi should to enabled in-order for command line arguments to work.
Service settings common to all browsers are described on the Service page.
Log output
Getting driver logs can be helpful for debugging various issues. The Service class lets you
direct where the logs will go. Logging output is ignored unless the user directs it somewhere.
File output
To change the logging output to save to a specific file:
Note: Java also allows setting console output by System Property;
Property key: InternetExplorerDriverService.IE_DRIVER_LOGFILE_PROPERTY
Property value: DriverService.LOG_STDOUT or DriverService.LOG_STDERR
Note: Java also allows setting log level by System Property:
Property key: InternetExplorerDriverService.IE_DRIVER_LOGLEVEL_PROPERTY
Property value: String representation of InternetExplorerDriverLogLevel.DEBUG.toString() enum
These are capabilities and features specific to Apple Safari browsers.
Unlike Chromium and Firefox drivers, the safaridriver is installed with the Operating System.
To enable automation on Safari, run the following command from the terminal:
safaridriver --enable
Options
Capabilities common to all browsers are described on the Options page.
Those looking to automate Safari on iOS should look to the Appium project.
Service
Service settings common to all browsers are described on the Service page.
Logging
Unlike other browsers, Safari doesn’t let you choose where logs are output, or change levels. The one option
available is to turn logs off or on. If logs are toggled on, they can be found at:~/Library/Logs/com.apple.WebDriver/.
Note: Java also allows setting console output by System Property;
Property key: SafariDriverService.SAFARI_DRIVER_LOGGING
Property value: "true" or "false"
WebDriver can generally be said to have a blocking API.
Because it is an out-of-process library that
instructs the browser what to do,
and because the web platform has an intrinsically asynchronous nature,
WebDriver does not track the active, real-time state of the DOM.
This comes with some challenges that we will discuss here.
From experience,
most intermittent issues that arise from use of Selenium and WebDriver
are connected to race conditions that occur between
the browser and the user’s instructions.
An example could be that the user instructs the browser to navigate to a page,
then gets a no such element error
when trying to find an element.
Consider the following document:
<!doctype html><metacharset=utf-8><title>Race Condition Example</title><script>varinitialised=false;window.addEventListener("load",function(){varnewElement=document.createElement("p");newElement.textContent="Hello from JavaScript!";document.body.appendChild(newElement);initialised=true;});</script>
The WebDriver instructions might look innocent enough:
driver.get("file:///race_condition.html");WebElementelement=driver.findElement(By.tagName("p"));assertEquals(element.getText(),"Hello from JavaScript!");
driver.navigate("file:///race_condition.html")el=driver.find_element(By.TAG_NAME,"p")assertel.text=="Hello from JavaScript!"
driver.Navigate().GoToUrl("file:///race_condition.html");IWebElementelement=driver.FindElement(By.TagName("p"));assertEquals(element.Text,"Hello from JavaScript!");
require'selenium-webdriver'driver=Selenium::WebDriver.for:firefoxbegin# Navigate to URLdriver.get'file:///race_condition.html'# Get and store Paragraph Textsearch_form=driver.find_element(:css,'p').text"Hello from JavaScript!".eql?search_formensuredriver.quitend
awaitdriver.get('file:///race_condition.html');constelement=driver.findElement(By.css('p'));assert.strictEqual(awaitelement.getText(),'Hello from JavaScript!');
driver.get("file:///race_condition.html")valelement=driver.findElement(By.tagName("p"))assert(element.text=="Hello from JavaScript!")
The issue here is that the default
page load strategy
used in WebDriver listens for the document.readyState
to change to "complete" before returning from the call to navigate.
Because the p element is
added after the document has completed loading,
this WebDriver script might be intermittent.
It “might” be intermittent because no guarantees can be made
about elements or events that trigger asynchronously
without explicitly waiting—or blocking—on those events.
Fortunately, the normal instruction set available on
the WebElement interface—such
as WebElement.click and WebElement.sendKeys—are
guaranteed to be synchronous,
in that the function calls will not return
(or the callback will not trigger in callback-style languages)
until the command has been completed in the browser.
The advanced user interaction APIs,
Keyboard
and Mouse,
are exceptions as they are explicitly intended as
“do what I say” asynchronous commands.
Waiting is having the automated task execution
elapse a certain amount of time before continuing with the next step.
To overcome the problem of race conditions
between the browser and your WebDriver script,
most Selenium clients ship with a wait package.
When employing a wait,
you are using what is commonly referred to
as an explicit wait.
Explicit wait
Explicit waits are available to Selenium clients
for imperative, procedural languages.
They allow your code to halt program execution,
or freeze the thread,
until the condition you pass it resolves.
The condition is called with a certain frequency
until the timeout of the wait is elapsed.
This means that for as long as the condition returns a falsy value,
it will keep trying and waiting.
Since explicit waits allow you to wait for a condition to occur,
they make a good fit for synchronising the state between the browser and its DOM,
and your WebDriver script.
To remedy our buggy instruction set from earlier,
we could employ a wait to have the findElement call
wait until the dynamically added element from the script
has been added to the DOM:
WebDriverdriver=newChromeDriver();driver.get("https://google.com/ncr");driver.findElement(By.name("q")).sendKeys("cheese"+Keys.ENTER);// Initialize and wait till element(link) became clickable - timeout in 10 seconds
WebElementfirstResult=newWebDriverWait(driver,Duration.ofSeconds(10)).until(ExpectedConditions.elementToBeClickable(By.xpath("//a/h3")));// Print the first result
System.out.println(firstResult.getText());
fromselenium.webdriver.support.waitimportWebDriverWaitdefdocument_initialised(driver):returndriver.execute_script("return initialised")driver.navigate("file:///race_condition.html")WebDriverWait(driver,timeout=10).until(document_initialised)el=driver.find_element(By.TAG_NAME,"p")assertel.text=="Hello from JavaScript!"
require'selenium-webdriver'driver=Selenium::WebDriver.for:firefoxwait=Selenium::WebDriver::Wait.new(:timeout=>10)defdocument_initialised(driver)driver.execute_script('return initialised')endbegindriver.get'file:///race_condition.html'wait.until{document_initialiseddriver}search_form=driver.find_element(:css,'p').text"Hello from JavaScript!".eql?search_formensuredriver.quitend
constdocumentInitialised=()=>driver.executeScript('return initialised');awaitdriver.get('file:///race_condition.html');awaitdriver.wait(()=>documentInitialised(),10000);constelement=driver.findElement(By.css('p'));assert.strictEqual(awaitelement.getText(),'Hello from JavaScript!');
driver.get("https://google.com/ncr")driver.findElement(By.name("q")).sendKeys("cheese"+Keys.ENTER)// Initialize and wait till element(link) became clickable - timeout in 10 seconds
valfirstResult=WebDriverWait(driver,Duration.ofSeconds(10)).until(ExpectedConditions.elementToBeClickable(By.xpath("//a/h3")))// Print the first result
println(firstResult.text)
We pass in the condition as a function reference
that the wait will run repeatedly until its return value is truthy.
A “truthful” return value is anything that evaluates to boolean true
in the language at hand, such as a string, number, a boolean,
an object (including a WebElement),
or a populated (non-empty) sequence or list.
That means an empty list evaluates to false.
When the condition is truthful and the blocking wait is aborted,
the return value from the condition becomes the return value of the wait.
With this knowledge,
and because the wait utility ignores no such element errors by default,
we can refactor our instructions to be more concise:
WebElementfoo=newWebDriverWait(driver,Duration.ofSeconds(3)).until(driver->driver.findElement(By.name("q")));assertEquals(foo.getText(),"Hello from JavaScript!");
fromselenium.webdriver.support.waitimportWebDriverWaitdriver.navigate("file:///race_condition.html")el=WebDriverWait(driver,timeout=3).until(lambdad:d.find_element(By.TAG_NAME,"p"))assertel.text=="Hello from JavaScript!"
using(vardriver=newFirefoxDriver()){varfoo=newWebDriverWait(driver,TimeSpan.FromSeconds(3)).Until(drv=>drv.FindElement(By.Name("q")));Debug.Assert(foo.Text.Equals("Hello from JavaScript!"));}
driver.get'file:///race_condition.html'wait=Selenium::WebDriver::Wait.new(:timeout=>10)ele=wait.until{driver.find_element(css:'p')}foo=ele.textassert_matchfoo,'Hello from JavaScript'
letele=awaitdriver.wait(until.elementLocated(By.css('p')),10000);letfoo=awaitele.getText();assert(foo=="Hello from JavaScript");
driver.get("file:///race_condition.html")valele=WebDriverWait(driver,Duration.ofSeconds(10)).until(ExpectedConditions.presenceOfElementLocated(By.tagName("p")))assert(ele.text=="Hello from JavaScript!")
In that example, we pass in an anonymous function
(but we could also define it explicitly as we did earlier so it may be reused).
The first and only argument that is passed to our condition
is always a reference to our driver object, WebDriver.
In a multi-threaded environment, you should be careful
to operate on the driver reference passed in to the condition
rather than the reference to the driver in the outer scope.
Because the wait will swallow no such element errors
that are raised when the element is not found,
the condition will retry until the element is found.
Then it will take the return value, a WebElement,
and pass it back through to our script.
If the condition fails,
e.g. a truthful return value from the condition is never reached,
the wait will throw/raise an error/exception called a timeout error.
Options
The wait condition can be customised to match your needs.
Sometimes it is unnecessary to wait the full extent of the default timeout,
as the penalty for not hitting a successful condition can be expensive.
The wait lets you pass in an argument to override the timeout:
Because it is quite a common occurrence
to have to synchronise the DOM and your instructions,
most clients also come with a set of predefined expected conditions.
As might be obvious by the name,
they are conditions that are predefined for frequent wait operations.
The conditions available in the different language bindings vary,
but this is a non-exhaustive list of a few:
alert is present
element exists
element is visible
title contains
title is
element staleness
visible text
You can refer to the API documentation for each client binding
to find an exhaustive list of expected conditions:
There is a second type of wait that is distinct from
explicit wait called implicit wait.
By implicitly waiting, WebDriver polls the DOM
for a certain duration when trying to find any element.
This can be useful when certain elements on the webpage
are not available immediately and need some time to load.
Implicit waiting for elements to appear is disabled by default
and will need to be manually enabled on a per-session basis.
Mixing explicit waits and implicit waits
will cause unintended consequences, namely waits sleeping for the maximum
time even if the element is available or condition is true.
Warning:
Do not mix implicit and explicit waits.
Doing so can cause unpredictable wait times.
For example, setting an implicit wait of 10 seconds
and an explicit wait of 15 seconds
could cause a timeout to occur after 20 seconds.
An implicit wait is to tell WebDriver to poll the DOM
for a certain amount of time when trying to find an element or elements
if they are not immediately available.
The default setting is 0, meaning disabled.
Once set, the implicit wait is set for the life of the session.
FluentWait instance defines the maximum amount of time to wait for a condition,
as well as the frequency with which to check the condition.
Users may configure the wait to ignore specific types of exceptions whilst waiting,
such as NoSuchElementException when searching for an element on the page.
// Waiting 30 seconds for an element to be present on the page, checking
// for its presence once every 5 seconds.
Wait<WebDriver>wait=newFluentWait<WebDriver>(driver).withTimeout(Duration.ofSeconds(30)).pollingEvery(Duration.ofSeconds(5)).ignoring(NoSuchElementException.class);WebElementfoo=wait.until(driver->{returndriver.findElement(By.id("foo"));});
require'selenium-webdriver'driver=Selenium::WebDriver.for:firefoxexception=Selenium::WebDriver::Error::NoSuchElementErrorbegindriver.get'http://somedomain/url_that_delays_loading'wait=Selenium::WebDriver::Wait.new(timeout:30,interval:5,message:'Timed out after 30 sec',ignore:exception)foo=wait.until{driver.find_element(id:'foo')}ensuredriver.quitend
const{Builder,until}=require('selenium-webdriver');(asyncfunctionexample(){letdriver=awaitnewBuilder().forBrowser('firefox').build();awaitdriver.get('http://somedomain/url_that_delays_loading');// Waiting 30 seconds for an element to be present on the page, checking
// for its presence once every 5 seconds.
letfoo=awaitdriver.wait(until.elementLocated(By.id('foo')),30000,'Timed out after 30 seconds',5000);})();
Identifying and working with element objects in the DOM.
The majority of most people’s Selenium code involves working with web elements.
5.1 - File Upload
The file upload dialog could be handled using Selenium,
when the input element is of type file.
An example of it, could be found on this
web page- https://the-internet.herokuapp.com/upload
We will require to have a file available with us,
which we need to upload.
The code to upload the file for different programming
languages will be as follows -
importjava.util.concurrent.TimeUnit;importorg.openqa.selenium.By;importorg.openqa.selenium.WebDriver;importorg.openqa.selenium.chrome.ChromeDriver;classfileUploadDoc{publicstaticvoidmain(String[]args){WebDriverdriver=newChromeDriver();driver.manage().timeouts().implicitlyWait(10,TimeUnit.SECONDS);driver.get("https://the-internet.herokuapp.com/upload");//we want to import selenium-snapshot file.
driver.findElement(By.id("file-upload")).sendKeys("selenium-snapshot.jpg");driver.findElement(By.id("file-submit")).submit();if(driver.getPageSource().contains("File Uploaded!")){System.out.println("file uploaded");}else{System.out.println("file not uploaded");}driver.quit();}}
fromseleniumimportwebdriverdriver.implicitly_wait(10)driver.get("https://the-internet.herokuapp.com/upload");driver.find_element(By.ID,"file-upload").send_keys("selenium-snapshot.jpg")driver.find_element(By.ID,"file-submit").submit()if(driver.page_source.find("File Uploaded!")):print("file upload success")else:print("file upload not successful")driver.quit()
usingSystem;usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;namespaceSeleniumDocumentation.SeleniumPRs{classFileUploadExample{staticvoidMain(String[]args){IWebDriverdriver=newChromeDriver();try{// Navigate to Urldriver.Navigate().GoToUrl("https://the-internet.herokuapp.com/upload");driver.FindElement(By.Id("file-upload")).SendKeys("selenium-snapshot.jpg");driver.FindElement(By.Id("file-submit")).Submit();if(driver.PageSource.Contains("File Uploaded!")){Console.WriteLine("file uploaded");}else{Console.WriteLine("file not uploaded");}driver.Quit();}}}
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromedriver.get("https://the-internet.herokuapp.com/upload")driver.find_element(:id,"file-upload").send_keys("selenium-snapshot.jpg")driver.find_element(:id,"file-submit").submit()ifdriver.page_source().include?"File Uploaded!"puts"file upload success"elseputs"file upload not successful"end
importorg.openqa.selenium.Byimportorg.openqa.selenium.chrome.ChromeDriverfunmain(){valdriver=ChromeDriver()driver.get("https://the-internet.herokuapp.com/upload")driver.findElement(By.id("file-upload")).sendKeys("selenium-snapshot.jpg")driver.findElement(By.id("file-submit")).submit()if(driver.pageSource.contains("File Uploaded!")){println("file uploaded")}else{println("file not uploaded")}}
So the above example code helps to understand
how we can upload a file using Selenium.
5.2 - Locator strategies
Ways to identify one or more specific elements in the DOM.
A locator is a way to identify elements on a page. It is the argument passed to the
Finding element methods.
Check out our encouraged test practices for tips on
locators, including which to use when and
why to declare locators separately from the finding methods.
Traditional Locators
Selenium provides support for these 8 traditional location strategies in WebDriver:
Locator
Description
class name
Locates elements whose class name contains the search value (compound class names are not permitted)
css selector
Locates elements matching a CSS selector
id
Locates elements whose ID attribute matches the search value
name
Locates elements whose NAME attribute matches the search value
link text
Locates anchor elements whose visible text matches the search value
partial link text
Locates anchor elements whose visible text contains the search value. If multiple elements are matching, only the first one will be selected.
tag name
Locates elements whose tag name matches the search value
xpath
Locates elements matching an XPath expression
Creating Locators
To work on a web element using Selenium, we need to first locate it on the web page.
Selenium provides us above mentioned ways, using which we can locate element on the
page. To understand and create locator we will use the following HTML snippet.
<html><body><style>.information{background-color:white;color:black;padding:10px;}</style><h2>Contact Selenium</h2><formaction="/action_page.php"><inputtype="radio"name="gender"value="m"/>Male <inputtype="radio"name="gender"value="f"/>Female <br><br><labelfor="fname">First name:</label><br><inputclass="information"type="text"id="fname"name="fname"value="Jane"><br><br><labelfor="lname">Last name:</label><br><inputclass="information"type="text"id="lname"name="lname"value="Doe"><br><br><labelfor="newsletter">Newsletter:</label><inputtype="checkbox"name="newsletter"value="1"/><br><br><inputtype="submit"value="Submit"></form><p>To know more about Selenium, visit the official page
<ahref ="www.selenium.dev">Selenium Official Page</a></p></body></html>
class name
The HTML page web element can have attribute class. We can see an example in the
above shown HTML snippet. We can identify these elements using the class name locator
available in Selenium.
CSS is the language used to style HTML pages. We can use css selector locator strategy
to identify the element on the page. If the element has an id, we create the locator
as css = #id. Otherwise the format we follow is css =[attribute=value] .
Let us see an example from above HTML snippet. We will create locator for First Name
textbox, using css.
We can use the ID attribute available with element in a web page to locate it.
Generally the ID property should be unique for a element on the web page.
We will identify the Last Name field using it.
We can use the NAME attribute available with element in a web page to locate it.
Generally the NAME property should be unique for a element on the web page.
We will identify the Newsletter checkbox using it.
If the element we want to locate is a link, we can use the link text locator
to identify it on the web page. The link text is the text displayed of the link.
In the HTML snippet shared, we have a link available, lets see how will we locate it.
WebDriverdriver=newChromeDriver();driver.findElement(By.linkText("Selenium Official Page"));
driver=webdriver.Chrome()driver.find_element(By.LINK_TEXT,"Selenium Official Page")
vardriver=newChromeDriver();driver.FindElement(By.LinkText("Selenium Official Page"));
driver=Selenium::WebDriver.for:chromedriver.find_element(link_text:'Selenium Official Page')
letdriver=awaitnewBuilder().forBrowser('chrome').build();constloc=awaitdriver.findElement(By.linkText('Selenium Official Page'));
valdriver=ChromeDriver()valloc:WebElement=driver.findElement(By.linkText("Selenium Official Page"))
partial link text
If the element we want to locate is a link, we can use the partial link text locator
to identify it on the web page. The link text is the text displayed of the link.
We can pass partial text as value.
In the HTML snippet shared, we have a link available, lets see how will we locate it.
We can use the HTML TAG itself as a locator to identify the web element on the page.
From the above HTML snippet shared, lets identify the link, using its html tag “a”.
A HTML document can be considered as a XML document, and then we can use xpath
which will be the path traversed to reach the element of interest to locate the element.
The XPath could be absolute xpath, which is created from the root of the document.
Example - /html/form/input[1]. This will return the male radio button.
Or the xpath could be relative. Example- //input[@name=‘fname’]. This will return the
first name text box. Let us create locator for female radio button using xpath.
Selenium 4 introduces Relative Locators (previously
called as Friendly Locators). These locators are helpful when it is not easy to construct a locator for
the desired element, but easy to describe spatially where the element is in relation to an element that does have
an easily constructed locator.
How it works
Selenium uses the JavaScript function
getBoundingClientRect()
to determine the size and position of elements on the page, and can use this information to locate neighboring elements.
find the relative elements.
Relative locator methods can take as the argument for the point of origin, either a previously located element reference,
or another locator. In these examples we’ll be using locators only, but you could swap the locator in the final method with
an element object and it will work the same.
Let us consider the below example for understanding the relative locators.
Available relative locators
Above
If the email text field element is not easily identifiable for some reason, but the password text field element is,
we can locate the text field element using the fact that it is an “input” element “above” the password element.
If the password text field element is not easily identifiable for some reason, but the email text field element is,
we can locate the text field element using the fact that it is an “input” element “below” the email element.
If the cancel button is not easily identifiable for some reason, but the submit button element is,
we can locate the cancel button element using the fact that it is a “button” element to the “left of” the submit element.
If the submit button is not easily identifiable for some reason, but the cancel button element is,
we can locate the submit button element using the fact that it is a “button” element “to the right of” the cancel element.
If the relative positioning is not obvious, or it varies based on window size, you can use the near method to
identify an element that is at most 50px away from the provided locator.
One great use case for this is to work with a form element that doesn’t have an easily constructed locator,
but its associated input label element does.
You can also chain locators if needed. Sometimes the element is most easily identified as being both above/below one element and right/left of another.
Locating the elements based on the provided locator values.
One of the most fundamental aspects of using Selenium is obtaining element references to work with.
Selenium offers a number of built-in locator strategies to uniquely identify an element.
There are many ways to use the locators in very advanced scenarios. For the purposes of this documentation,
let’s consider this HTML snippet:
<olid="vegetables"><liclass="potatoes">…
<liclass="onions">…
<liclass="tomatoes"><span>Tomato is a Vegetable</span>…
</ol><ulid="fruits"><liclass="bananas">…
<liclass="apples">…
<liclass="tomatoes"><span>Tomato is a Fruit</span>…
</ul>
First matching element
Many locators will match multiple elements on the page. The singular find element method will return a reference to the
first element found within a given context.
Evaluating entire DOM
When the find element method is called on the driver instance, it
returns a reference to the first element in the DOM that matches with the provided locator.
This value can be stored and used for future element actions. In our example HTML above, there are
two elements that have a class name of “tomatoes” so this method will return the element in the “vegetables” list.
Rather than finding a unique locator in the entire DOM, it is often useful to narrow the search to the scope
of another located element. In the above example there are two elements with a class name of “tomatoes” and
it is a little more challenging to get the reference for the second one.
One solution is to locate an element with a unique attribute that is an ancestor of the desired element and not an
ancestor of the undesired element, then call find element on that object:
Java and C# WebDriver, WebElement and ShadowRoot classes all implement a SearchContext interface, which is
considered a role-based interface. Role-based interfaces allow you to determine whether a particular
driver implementation supports a given feature. These interfaces are clearly defined and try
to adhere to having only a single role of responsibility.
Optimized locator
A nested lookup might not be the most effective location strategy since it requires two
separate commands to be issued to the browser.
There are several use cases for needing to get references to all elements that match a locator, rather
than just the first one. The plural find elements methods return a collection of element references.
If there are no matches, an empty list is returned. In this case,
references to all fruits and vegetable list items will be returned in a collection.
Often you get a collection of elements but want to work with a specific element, which means you
need to iterate over the collection and identify the one you want.
fromseleniumimportwebdriverfromselenium.webdriver.common.byimportBydriver=webdriver.Firefox()# Navigate to Urldriver.get("https://www.example.com")# Get all the elements available with tag name 'p'elements=driver.find_elements(By.TAG_NAME,'p')foreinelements:print(e.text)
usingOpenQA.Selenium;usingOpenQA.Selenium.Firefox;usingSystem.Collections.Generic;namespaceFindElementsExample{classFindElementsExample{publicstaticvoidMain(string[]args){IWebDriverdriver=newFirefoxDriver();try{// Navigate to Urldriver.Navigate().GoToUrl("https://example.com");// Get all the elements available with tag name 'p'IList<IWebElement>elements=driver.FindElements(By.TagName("p"));foreach(IWebElementeinelements){System.Console.WriteLine(e.Text);}}finally{driver.Quit();}}}}
require'selenium-webdriver'driver=Selenium::WebDriver.for:firefoxbegin# Navigate to URLdriver.get'https://www.example.com'# Get all the elements available with tag name 'p'elements=driver.find_elements(:tag_name,'p')elements.each{|e|putse.text}ensuredriver.quitend
const{Builder,By}=require('selenium-webdriver');(asyncfunctionexample(){letdriver=awaitnewBuilder().forBrowser('firefox').build();try{// Navigate to Url
awaitdriver.get('https://www.example.com');// Get all the elements available with tag 'p'
letelements=awaitdriver.findElements(By.css('p'));for(leteofelements){console.log(awaite.getText());}}finally{awaitdriver.quit();}})();
importorg.openqa.selenium.Byimportorg.openqa.selenium.firefox.FirefoxDriverfunmain(){valdriver=FirefoxDriver()try{driver.get("https://example.com")// Get all the elements available with tag name 'p'
valelements=driver.findElements(By.tagName("p"))for(elementinelements){println("Paragraph text:"+element.text)}}finally{driver.quit()}}
Find Elements From Element
It is used to find the list of matching child WebElements within the context of parent element.
To achieve this, the parent WebElement is chained with ‘findElements’ to access child elements
importorg.openqa.selenium.By;importorg.openqa.selenium.WebDriver;importorg.openqa.selenium.WebElement;importorg.openqa.selenium.chrome.ChromeDriver;importjava.util.List;publicclassfindElementsFromElement{publicstaticvoidmain(String[]args){WebDriverdriver=newChromeDriver();try{driver.get("https://example.com");// Get element with tag name 'div'
WebElementelement=driver.findElement(By.tagName("div"));// Get all the elements available with tag name 'p'
List<WebElement>elements=element.findElements(By.tagName("p"));for(WebElemente:elements){System.out.println(e.getText());}}finally{driver.quit();}}}
fromseleniumimportwebdriverfromselenium.webdriver.common.byimportBydriver=webdriver.Chrome()driver.get("https://www.example.com")# Get element with tag name 'div'element=driver.find_element(By.TAG_NAME,'div')# Get all the elements available with tag name 'p'elements=element.find_elements(By.TAG_NAME,'p')foreinelements:print(e.text)
usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;usingSystem.Collections.Generic;namespaceFindElementsFromElement{classFindElementsFromElement{publicstaticvoidMain(string[]args){IWebDriverdriver=newChromeDriver();try{driver.Navigate().GoToUrl("https://example.com");// Get element with tag name 'div'IWebElementelement=driver.FindElement(By.TagName("div"));// Get all the elements available with tag name 'p'IList<IWebElement>elements=element.FindElements(By.TagName("p"));foreach(IWebElementeinelements){System.Console.WriteLine(e.Text);}}finally{driver.Quit();}}}}
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegin# Navigate to URLdriver.get'https://www.example.com'# Get element with tag name 'div'element=driver.find_element(:tag_name,'div')# Get all the elements available with tag name 'p'elements=element.find_elements(:tag_name,'p')elements.each{|e|putse.text}ensuredriver.quitend
const{Builder,By}=require('selenium-webdriver');(asyncfunctionexample(){letdriver=newBuilder().forBrowser('chrome').build();awaitdriver.get('https://www.example.com');// Get element with tag name 'div'
letelement=driver.findElement(By.css("div"));// Get all the elements available with tag name 'p'
letelements=awaitelement.findElements(By.css("p"));for(leteofelements){console.log(awaite.getText());}})();
importorg.openqa.selenium.Byimportorg.openqa.selenium.chrome.ChromeDriverfunmain(){valdriver=ChromeDriver()try{driver.get("https://example.com")// Get element with tag name 'div'
valelement=driver.findElement(By.tagName("div"))// Get all the elements available with tag name 'p'
valelements=element.findElements(By.tagName("p"))for(einelements){println(e.text)}}finally{driver.quit()}}
Get Active Element
It is used to track (or) find DOM element which has the focus in the current browsing context.
importorg.openqa.selenium.*;importorg.openqa.selenium.chrome.ChromeDriver;publicclassactiveElementTest{publicstaticvoidmain(String[]args){WebDriverdriver=newChromeDriver();try{driver.get("http://www.google.com");driver.findElement(By.cssSelector("[name='q']")).sendKeys("webElement");// Get attribute of current active element
Stringattr=driver.switchTo().activeElement().getAttribute("title");System.out.println(attr);}finally{driver.quit();}}}
fromseleniumimportwebdriverfromselenium.webdriver.common.byimportBydriver=webdriver.Chrome()driver.get("https://www.google.com")driver.find_element(By.CSS_SELECTOR,'[name="q"]').send_keys("webElement")# Get attribute of current active elementattr=driver.switch_to.active_element.get_attribute("title")print(attr)
usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;namespaceActiveElement{classActiveElement{publicstaticvoidMain(string[]args){IWebDriverdriver=newChromeDriver();try{// Navigate to Urldriver.Navigate().GoToUrl("https://www.google.com");driver.FindElement(By.CssSelector("[name='q']")).SendKeys("webElement");// Get attribute of current active elementstringattr=driver.SwitchTo().ActiveElement().GetAttribute("title");System.Console.WriteLine(attr);}finally{driver.Quit();}}}}
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegindriver.get'https://www.google.com'driver.find_element(css:'[name="q"]').send_keys('webElement')# Get attribute of current active elementattr=driver.switch_to.active_element.attribute('title')putsattrensuredriver.quitend
const{Builder,By}=require('selenium-webdriver');(asyncfunctionexample(){letdriver=awaitnewBuilder().forBrowser('chrome').build();awaitdriver.get('https://www.google.com');awaitdriver.findElement(By.css('[name="q"]')).sendKeys("webElement");// Get attribute of current active element
letattr=awaitdriver.switchTo().activeElement().getAttribute("title");console.log(`${attr}`)})();
importorg.openqa.selenium.Byimportorg.openqa.selenium.chrome.ChromeDriverfunmain(){valdriver=ChromeDriver()try{driver.get("https://www.google.com")driver.findElement(By.cssSelector("[name='q']")).sendKeys("webElement")// Get attribute of current active element
valattr=driver.switchTo().activeElement().getAttribute("title")print(attr)}finally{driver.quit()}}
5.4 - Interacting with web elements
A high-level instruction set for manipulating form controls.
There are only 5 basic commands that can be executed on an element:
These methods are designed to closely emulate a user’s experience, so,
unlike the Actions API, it attempts to perform two things
before attempting the specified action.
If it determines the element is outside the viewport, it
scrolls the element into view, specifically
it will align the bottom of the element with the bottom of the viewport.
It ensures the element is interactable
before taking the action. This could mean that the scrolling was unsuccessful, or that the
element is not otherwise displayed. Determining if an element is displayed on a page was too difficult to
define directly in the webdriver specification,
so Selenium sends an execute command with a JavaScript atom that checks for things that would keep
the element from being displayed. If it determines an element is not in the viewport, not displayed, not
keyboard-interactable, or not
pointer-interactable,
it returns an element not interactable error.
// Navigate to Url
driver.get("https://www.selenium.dev/selenium/web/inputs.html");// Click on the element
driver.findElement(By.name("color_input")).click();
# Navigate to urldriver.get("https://www.selenium.dev/selenium/web/inputs.html")# Click on the element driver.find_element(By.NAME,"color_input").click()
// Navigate to Urldriver.Navigate().GoToUrl("https://www.selenium.dev/selenium/web/inputs.html");// Click the elementdriver.FindElement(By.Name("color_input")).Click();
# Navigate to URLdriver.get'https://www.selenium.dev/selenium/web/inputs.html'# Click the elementdriver.find_element(name:'color_input').click
// Navigate to Url
awaitdriver.get('https://www.selenium.dev/selenium/web/inputs.html');// Click the element
awaitdriver.findElement(By.name('color_input')).click();
// Navigate to Url
driver.get("https://www.selenium.dev/selenium/web/inputs.html")// Click the element
driver.findElement(By.name("color_input")).click();
Send keys
The element send keys command
types the provided keys into an editable element.
Typically, this means an element is an input element of a form with a text type or an element
with a content-editable attribute. If it is not editable,
an invalid element state error is returned.
Here is the list of
possible keystrokes that WebDriver Supports.
// Navigate to Url
driver.get("https://www.selenium.dev/selenium/web/inputs.html");// Clear field to empty it from any previous data
driver.findElement(By.name("email_input")).clear();//Enter Text
driver.findElement(By.name("email_input")).sendKeys("admin@localhost.dev");
# Navigate to urldriver.get("https://www.selenium.dev/selenium/web/inputs.html")# Clear field to empty it from any previous datadriver.find_element(By.NAME,"email_input").clear()# Enter Textdriver.find_element(By.NAME,"email_input").send_keys("admin@localhost.dev")
// Navigate to Urldriver.Navigate().GoToUrl("https://www.selenium.dev/selenium/web/inputs.html");// Clear field to empty it from any previous datadriver.FindElement(By.Name("email_input")).Clear();//Enter Textdriver.FindElement(By.Name("email_input")).SendKeys("admin@localhost.dev");}
# Navigate to URLdriver.get'https://www.selenium.dev/selenium/web/inputs.html'# Clear field to empty it from any previous datadriver.find_element(name:'email_input').clear# Enter Textdriver.find_element(name:'email_input').send_keys'admin@localhost.dev'
// Navigate to Url
awaitdriver.get('https://www.selenium.dev/selenium/web/inputs.html');//Clear field to empty it from any previous data
awaitdriver.findElement(By.name('email_input')).clear();// Enter text
awaitdriver.findElement(By.name('email_input')).sendKeys('admin@localhost.dev');
// Navigate to Url
driver.get("https://www.selenium.dev/selenium/web/inputs.html")//Clear field to empty it from any previous data
driver.findElement(By.name("email_input")).clear()// Enter text
driver.findElement(By.name("email_input")).sendKeys("admin@localhost.dev")
Clear
The element clear command resets the content of an element.
This requires an element to be editable,
and resettable. Typically,
this means an element is an input element of a form with a text type or an element
with acontent-editable attribute. If these conditions are not met,
an invalid element state error is returned.
// Navigate to Url
driver.get("https://www.selenium.dev/selenium/web/inputs.html");// Clear field to empty it from any previous data
driver.findElement(By.name("email_input")).clear();
# Navigate to urldriver.get("https://www.selenium.dev/selenium/web/inputs.html")# Clear field to empty it from any previous datadriver.find_element(By.NAME,"email_input").clear()
// Navigate to Urldriver.Navigate().GoToUrl("https://www.selenium.dev/selenium/web/inputs.html");// Clear field to empty it from any previous datadriver.FindElement(By.Name("email_input")).Clear();}
# Navigate to URLdriver.get'https://www.selenium.dev/selenium/web/inputs.html'# Clear field to empty it from any previous datadriver.find_element(name:'email_input').clear
// Navigate to Url
awaitdriver.get('https://www.selenium.dev/selenium/web/inputs.html');//Clear field to empty it from any previous data
awaitdriver.findElement(By.name('email_input')).clear();
// Navigate to Url
driver.get("https://www.selenium.dev/selenium/web/inputs.html")//Clear field to empty it from any previous data
driver.findElement(By.name("email_input")).clear()
Submit
In Selenium 4 this is no longer implemented with a separate endpoint and functions by executing a script. As
such, it is recommended not to use this method and to click the applicable form submission button instead.
5.5 - Information about web elements
What you can learn about an element.
There are a number of details you can query about a specific element.
Is Displayed
This method is used to check if the connected Element is
displayed on a webpage. Returns a Boolean value,
True if the connected element is displayed in the current
browsing context else returns false.
This functionality is mentioned in, but not defined by
the w3c specification due to the
impossibility of covering all potential conditions.
As such, Selenium cannot expect drivers to implement
this functionality directly, and now relies on
executing a large JavaScript function directly.
This function makes many approximations about an element’s
nature and relationship in the tree to return a value.
// Navigate to the url
driver.get("https://www.selenium.dev/selenium/web/inputs.html");// Get boolean value for is element display
booleanisEmailVisible=driver.findElement(By.name("email_input")).isDisplayed();
# Navigate to the urldriver.get("https://www.selenium.dev/selenium/web/inputs.html")# Get boolean value for is element displayis_email_visible=driver.find_element(By.NAME,"email_input").is_displayed()
//Navigate to the urldriver.Url="https://www.selenium.dev/selenium/web/inputs.html";//Get boolean value for is element displayBooleanis_email_visible=driver.FindElement(By.Name("email_input")).Displayed;
# Navigate to the urldriver.get("https://www.selenium.dev/selenium/web/inputs.html");#fetch display statusval=driver.find_element(name:'email_input').displayed?
// Navigate to url
awaitdriver.get("https://www.selenium.dev/selenium/web/inputs.html");// Resolves Promise and returns boolean value
letresult=awaitdriver.findElement(By.name("email_input")).isDisplayed();
//navigates to url
driver.get("https://www.selenium.dev/selenium/web/inputs.html")//returns true if element is displayed else returns false
valflag=driver.findElement(By.name("email_input")).isDisplayed()
Is Enabled
This method is used to check if the connected Element
is enabled or disabled on a webpage.
Returns a boolean value, True if the connected element is
enabled in the current browsing context else returns false.
//navigates to url
driver.get("https://www.selenium.dev/selenium/web/inputs.html");//returns true if element is enabled else returns false
booleanvalue=driver.findElement(By.name("button_input")).isEnabled();
# Navigate to urldriver.get("https://www.selenium.dev/selenium/web/inputs.html")# Returns true if element is enabled else returns falsevalue=driver.find_element(By.NAME,'button_input').is_enabled()
// Navigate to Urldriver.Navigate().GoToUrl("https://www.selenium.dev/selenium/web/inputs.html");// Store the WebElementIWebElementelement=driver.FindElement(By.Name("button_input"));// Prints true if element is enabled else returns falseSystem.Console.WriteLine(element.Enabled);
# Navigate to urldriver.get'https://www.selenium.dev/selenium/web/inputs.html'# Returns true if element is enabled else returns falseele=driver.find_element(name:'button_input').enabled?
// Navigate to url
awaitdriver.get('https://www.selenium.dev/selenium/web/inputs.html');// Resolves Promise and returns boolean value
letelement=awaitdriver.findElement(By.name("button_input")).isEnabled();
//navigates to url
driver.get("https://www.selenium.dev/selenium/web/inputs.html")//returns true if element is enabled else returns false
valattr=driver.findElement(By.name("button_input")).isEnabled()
Is Selected
This method determines if the referenced Element
is Selected or not. This method is widely used on
Check boxes, radio buttons, input elements, and option elements.
Returns a boolean value, True if referenced element is
selected in the current browsing context else returns false.
//navigates to url
driver.get("https://www.selenium.dev/selenium/web/inputs.html");//returns true if element is checked else returns false
booleanvalue=driver.findElement(By.name("checkbox_input")).isSelected();
# Navigate to urldriver.get("https://www.selenium.dev/selenium/web/inputs.html")# Returns true if element is checked else returns falsevalue=driver.find_element(By.NAME,"checkbox_input").is_selected()
// Navigate to Urldriver.Navigate().GoToUrl("https://www.selenium.dev/selenium/web/inputs.html");// Returns true if element ins checked else returns falseboolvalue=driver.FindElement(By.Name("checkbox_input")).Selected;
# Navigate to urldriver.get'https://www.selenium.dev/selenium/web/inputs.html'# Returns true if element is checked else returns falseele=driver.find_element(name:"checkbox_input").selected?
// Navigate to url
awaitdriver.get('https://www.selenium.dev/selenium/web/inputs.html');// Returns true if element ins checked else returns false
letres=awaitdriver.findElement(By.name("checkbox_input")).isSelected();
//navigates to url
driver.get("https://www.selenium.dev/selenium/web/inputs.html")//returns true if element is checked else returns false
valattr=driver.findElement(By.name("checkbox_input")).isSelected()
Tag Name
It is used to fetch the TagName
of the referenced Element which has the focus in the current browsing context.
//navigates to url
driver.get("https://www.selenium.dev/selenium/web/inputs.html");//returns TagName of the element
Stringvalue=driver.findElement(By.name("email_input")).getTagName();
# Navigate to urldriver.get("https://www.selenium.dev/selenium/web/inputs.html")# Returns TagName of the elementattr=driver.find_element(By.NAME,"email_input").tag_name
// Navigate to Urldriver.Navigate().GoToUrl("https://www.selenium.dev/selenium/web/inputs.html");// Returns TagName of the elementstringattr=driver.FindElement(By.Name("email_input")).TagName;
# Navigate to urldriver.get'https://www.selenium.dev/selenium/web/inputs.html'# Returns TagName of the elementattr=driver.find_element(name:"email_input").tag_name
// Navigate to URL
awaitdriver.get('https://www.selenium.dev/selenium/web/inputs.html');// Returns TagName of the element
letvalue=awaitdriver.findElement(By.name('email_input')).getTagName();
//navigates to url
driver.get("https://www.selenium.dev/selenium/web/inputs.html")//returns TagName of the element
valattr=driver.findElement(By.name("email_input")).getTagName()
Size and Position
It is used to fetch the dimensions and coordinates
of the referenced element.
The fetched data body contain the following details:
X-axis position from the top-left corner of the element
y-axis position from the top-left corner of the element
Height of the element
Width of the element
// Navigate to url
driver.get("https://www.selenium.dev/selenium/web/inputs.html");// Returns height, width, x and y coordinates referenced element
Rectangleres=driver.findElement(By.name("range_input")).getRect();// Rectangle class provides getX,getY, getWidth, getHeight methods
System.out.println(res.getX());
# Navigate to urldriver.get("https://www.selenium.dev/selenium/web/inputs.html")# Returns height, width, x and y coordinates referenced elementres=driver.find_element(By.NAME,"range_input").rect
// Navigate to Urldriver.Navigate().GoToUrl("https://www.selenium.dev/selenium/web/inputs.html");varres=driver.FindElement(By.Name("range_input"));// Return x and y coordinates referenced elementSystem.Console.WriteLine(res.Location);// Returns height, widthSystem.Console.WriteLine(res.Size);
# Navigate to urldriver.get'https://www.selenium.dev/selenium/web/inputs.html'# Returns height, width, x and y coordinates referenced elementres=driver.find_element(name:"range_input").rect
// Navigate to url
awaitdriver.get('https://www.selenium.dev/selenium/web/inputs.html');// Returns height, width, x and y coordinates referenced element
letelement=awaitdriver.findElement(By.name("range_input")).getRect();
// Navigate to url
driver.get("https://www.selenium.dev/selenium/web/inputs.html")// Returns height, width, x and y coordinates referenced element
valres=driver.findElement(By.name("range_input")).rect// Rectangle class provides getX,getY, getWidth, getHeight methods
println(res.getX())
Get CSS Value
Retrieves the value of specified computed style property
of an element in the current browsing context.
// Navigate to Url
driver.get("https://www.selenium.dev/selenium/web/colorPage.html");// Retrieves the computed style property 'color' of linktext
StringcssValue=driver.findElement(By.id("namedColor")).getCssValue("background-color");
# Navigate to Urldriver.get('https://www.selenium.dev/selenium/web/colorPage.html')# Retrieves the computed style property 'color' of linktextcssValue=driver.find_element(By.ID,"namedColor").value_of_css_property('background-color')
// Navigate to Urldriver.Navigate().GoToUrl("https://www.selenium.dev/selenium/web/colorPage.html");// Retrieves the computed style property 'color' of linktextStringcssValue=driver.FindElement(By.Id("namedColor")).GetCssValue("background-color");
# Navigate to Urldriver.get'https://www.selenium.dev/selenium/web/colorPage.html'# Retrieves the computed style property 'color' of linktextcssValue=driver.find_element(:id,'namedColor').css_value('background-color')
// Navigate to Url
awaitdriver.get('https://www.selenium.dev/selenium/web/colorPage.html');// Retrieves the computed style property 'color' of linktext
letcssValue=awaitdriver.findElement(By.id("namedColor")).getCssValue('background-color');
// Navigate to Url
driver.get("https://www.selenium.dev/selenium/web/colorPage.html")// Retrieves the computed style property 'color' of linktext
valcssValue=driver.findElement(By.id("namedColor")).getCssValue("background-color")
Text Content
Retrieves the rendered text of the specified element.
// Navigate to url
driver.get("https://www.selenium.dev/selenium/web/linked_image.html");// Retrieves the text of the element
Stringtext=driver.findElement(By.id("justanotherlink")).getText();
# Navigate to urldriver.get("https://www.selenium.dev/selenium/web/linked_image.html")# Retrieves the text of the elementtext=driver.find_element(By.ID,"justanotherlink").text
// Navigate to urldriver.Url="https://www.selenium.dev/selenium/web/linked_image.html";// Retrieves the text of the elementStringtext=driver.FindElement(By.Id("justanotherlink")).Text;
# Navigate to urldriver.get'https://www.selenium.dev/selenium/web/linked_image.html'# Retrieves the text of the elementtext=driver.find_element(:id,'justanotherlink').text
// Navigate to URL
awaitdriver.get('https://www.selenium.dev/selenium/web/linked_image.html');// retrieves the text of the element
lettext=awaitdriver.findElement(By.id('justanotherlink')).getText();
// Navigate to URL
driver.get("https://www.selenium.dev/selenium/web/linked_image.html")// retrieves the text of the element
valtext=driver.findElement(By.id("justanotherlink")).getText()
Fetching Attributes or Properties
Fetches the run time value associated with a
DOM attribute. It returns the data associated
with the DOM attribute or property of the element.
//Navigate to the url
driver.get("https://www.selenium.dev/selenium/web/inputs.html");//identify the email text box
WebElementemailTxt=driver.findElement(By.name(("email_input")));//fetch the value property associated with the textbox
StringvalueInfo=eleSelLink.getAttribute("value");
# Navigate to the urldriver.get("https://www.selenium.dev/selenium/web/inputs.html")# Identify the email text boxemail_txt=driver.find_element(By.NAME,"email_input")# Fetch the value property associated with the textboxvalue_info=email_txt.get_attribute("value")
//Navigate to the urldriver.Url="https://www.selenium.dev/selenium/web/inputs.html";//identify the email text boxIWebElementemailTxt=driver.FindElement(By.Name(("email_input")));//fetch the value property associated with the textboxStringvalueInfo=eleSelLink.GetAttribute("value");
# Navigate to the urldriver.get("https://www.selenium.dev/selenium/web/inputs.html");#identify the email text boxemail_element=driver.find_element(name:'email_input')#fetch the value property associated with the textboxemailVal=email_element.attribute("value");
// Navigate to the Url
awaitdriver.get("https://www.selenium.dev/selenium/web/inputs.html");// identify the email text box
constemailElement=awaitdriver.findElements(By.xpath('//input[@name="email_input"]'));//fetch the attribute "name" associated with the textbox
constnameAttribute=awaitemailElement.getAttribute("name");
// Navigate to URL
driver.get("https://www.selenium.dev/selenium/web/inputs.html")//fetch the value property associated with the textbox
valattr=driver.findElement(By.name("email_input")).getAttribute("value")
6 - Browser interactions
Get browser information
Get title
You can read the current page title from the browser:
driver.getTitle();
driver.title
driver.Title;
driver.title
awaitdriver.getTitle();
driver.title
Get current URL
You can read the current URL from the browser’s address bar using:
driver.getCurrentUrl();
driver.current_url
driver.Url;
driver.current_url
awaitdriver.getCurrentUrl();
driver.currentUrl
6.1 - Browser navigation
Navigate to
The first thing you will want to do after launching a browser is to
open your website. This can be achieved in a single line:
//Convenient
driver.get("https://selenium.dev");//Longer way
driver.navigate().to("https://selenium.dev");
//Convenient
driver.get("https://selenium.dev")//Longer way
driver.navigate().to("https://selenium.dev")
Back
Pressing the browser’s back button:
driver.navigate().back();
driver.back()
driver.Navigate().Back();
driver.navigate.back
awaitdriver.navigate().back();
driver.navigate().back()
Forward
Pressing the browser’s forward button:
driver.navigate().forward();
driver.forward()
driver.Navigate().Forward();
driver.navigate.forward
awaitdriver.navigate().forward();
driver.navigate().forward()
Refresh
Refresh the current page:
driver.navigate().refresh();
driver.refresh()
driver.Navigate().Refresh();
driver.navigate.refresh
awaitdriver.navigate().refresh();
driver.navigate().refresh()
6.2 - JavaScript alerts, prompts and confirmations
WebDriver provides an API for working with the three types of native
popup messages offered by JavaScript. These popups are styled by the
browser and offer limited customisation.
Alerts
The simplest of these is referred to as an alert, which shows a
custom message, and a single button which dismisses the alert, labelled
in most browsers as OK. It can also be dismissed in most browsers by
pressing the close button, but this will always do the same thing as
the OK button. See an example alert.
WebDriver can get the text from the popup and accept or dismiss these
alerts.
//Click the link to activate the alert
driver.findElement(By.linkText("See an example alert")).click();//Wait for the alert to be displayed and store it in a variable
Alertalert=wait.until(ExpectedConditions.alertIsPresent());//Store the alert text in a variable
Stringtext=alert.getText();//Press the OK button
alert.accept();
# Click the link to activate the alertdriver.find_element(By.LINK_TEXT,"See an example alert").click()# Wait for the alert to be displayed and store it in a variablealert=wait.until(expected_conditions.alert_is_present())# Store the alert text in a variabletext=alert.text# Press the OK buttonalert.accept()
//Click the link to activate the alertdriver.FindElement(By.LinkText("See an example alert")).Click();//Wait for the alert to be displayed and store it in a variableIAlertalert=wait.Until(ExpectedConditions.AlertIsPresent());//Store the alert text in a variablestringtext=alert.Text;//Press the OK buttonalert.Accept();
# Click the link to activate the alertdriver.find_element(:link_text,'See an example alert').click# Store the alert reference in a variablealert=driver.switch_to.alert# Store the alert text in a variablealert_text=alert.text# Press on OK buttonalert.accept
//Click the link to activate the alert
awaitdriver.findElement(By.linkText('See an example alert')).click();// Wait for the alert to be displayed
awaitdriver.wait(until.alertIsPresent());// Store the alert in a variable
letalert=awaitdriver.switchTo().alert();//Store the alert text in a variable
letalertText=awaitalert.getText();//Press the OK button
awaitalert.accept();// Note: To use await, the above code should be inside an async function
//Click the link to activate the alert
driver.findElement(By.linkText("See an example alert")).click()//Wait for the alert to be displayed and store it in a variable
valalert=wait.until(ExpectedConditions.alertIsPresent())//Store the alert text in a variable
valtext=alert.getText()//Press the OK button
alert.accept()
Confirm
A confirm box is similar to an alert, except the user can also choose
to cancel the message. See
a sample confirm.
This example also shows a different approach to storing an alert:
//Click the link to activate the alert
driver.findElement(By.linkText("See a sample confirm")).click();//Wait for the alert to be displayed
wait.until(ExpectedConditions.alertIsPresent());//Store the alert in a variable
Alertalert=driver.switchTo().alert();//Store the alert in a variable for reuse
Stringtext=alert.getText();//Press the Cancel button
alert.dismiss();
# Click the link to activate the alertdriver.find_element(By.LINK_TEXT,"See a sample confirm").click()# Wait for the alert to be displayedwait.until(expected_conditions.alert_is_present())# Store the alert in a variable for reusealert=driver.switch_to.alert# Store the alert text in a variabletext=alert.text# Press the Cancel buttonalert.dismiss()
//Click the link to activate the alertdriver.FindElement(By.LinkText("See a sample confirm")).Click();//Wait for the alert to be displayedwait.Until(ExpectedConditions.AlertIsPresent());//Store the alert in a variableIAlertalert=driver.SwitchTo().Alert();//Store the alert in a variable for reusestringtext=alert.Text;//Press the Cancel buttonalert.Dismiss();
# Click the link to activate the alertdriver.find_element(:link_text,'See a sample confirm').click# Store the alert reference in a variablealert=driver.switch_to.alert# Store the alert text in a variablealert_text=alert.text# Press on Cancel buttonalert.dismiss
//Click the link to activate the alert
awaitdriver.findElement(By.linkText('See a sample confirm')).click();// Wait for the alert to be displayed
awaitdriver.wait(until.alertIsPresent());// Store the alert in a variable
letalert=awaitdriver.switchTo().alert();//Store the alert text in a variable
letalertText=awaitalert.getText();//Press the Cancel button
awaitalert.dismiss();// Note: To use await, the above code should be inside an async function
//Click the link to activate the alert
driver.findElement(By.linkText("See a sample confirm")).click()//Wait for the alert to be displayed
wait.until(ExpectedConditions.alertIsPresent())//Store the alert in a variable
valalert=driver.switchTo().alert()//Store the alert in a variable for reuse
valtext=alert.text//Press the Cancel button
alert.dismiss()
Prompt
Prompts are similar to confirm boxes, except they also include a text
input. Similar to working with form elements, you can use WebDriver’s
send keys to fill in a response. This will completely replace the placeholder
text. Pressing the cancel button will not submit any text.
See a sample prompt.
//Click the link to activate the alert
driver.findElement(By.linkText("See a sample prompt")).click();//Wait for the alert to be displayed and store it in a variable
Alertalert=wait.until(ExpectedConditions.alertIsPresent());//Type your message
alert.sendKeys("Selenium");//Press the OK button
alert.accept();
# Click the link to activate the alertdriver.find_element(By.LINK_TEXT,"See a sample prompt").click()# Wait for the alert to be displayedwait.until(expected_conditions.alert_is_present())# Store the alert in a variable for reusealert=Alert(driver)# Type your messagealert.send_keys("Selenium")# Press the OK buttonalert.accept()
//Click the link to activate the alertdriver.FindElement(By.LinkText("See a sample prompt")).Click();//Wait for the alert to be displayed and store it in a variableIAlertalert=wait.Until(ExpectedConditions.AlertIsPresent());//Type your messagealert.SendKeys("Selenium");//Press the OK buttonalert.Accept();
# Click the link to activate the alertdriver.find_element(:link_text,'See a sample prompt').click# Store the alert reference in a variablealert=driver.switch_to.alert# Type a messagealert.send_keys("selenium")# Press on Ok buttonalert.accept
//Click the link to activate the alert
awaitdriver.findElement(By.linkText('See a sample prompt')).click();// Wait for the alert to be displayed
awaitdriver.wait(until.alertIsPresent());// Store the alert in a variable
letalert=awaitdriver.switchTo().alert();//Type your message
awaitalert.sendKeys("Selenium");//Press the OK button
awaitalert.accept();//Note: To use await, the above code should be inside an async function
//Click the link to activate the alert
driver.findElement(By.linkText("See a sample prompt")).click()//Wait for the alert to be displayed and store it in a variable
valalert=wait.until(ExpectedConditions.alertIsPresent())//Type your message
alert.sendKeys("Selenium")//Press the OK button
alert.accept()
6.3 - Working with cookies
A cookie is a small piece of data that is sent from a website and stored in your computer.
Cookies are mostly used to recognise the user and load the stored information.
WebDriver API provides a way to interact with cookies with built-in methods:
Add Cookie
It is used to add a cookie to the current browsing context.
Add Cookie only accepts a set of defined serializable JSON object. Here
is the link to the list of accepted JSON key values
First of all, you need to be on the domain that the cookie will be
valid for. If you are trying to preset cookies before
you start interacting with a site and your homepage is large / takes a while to load
an alternative is to find a smaller page on the site (typically the 404 page is small,
e.g. http://example.com/some404page)
importorg.openqa.selenium.*;importorg.openqa.selenium.chrome.ChromeDriver;publicclassaddCookie{publicstaticvoidmain(String[]args){WebDriverdriver=newChromeDriver();try{driver.get("http://www.example.com");// Adds the cookie into current browser context
driver.manage().addCookie(newCookie("key","value"));}finally{driver.quit();}}}
fromseleniumimportwebdriverdriver=webdriver.Chrome()driver.get("http://www.example.com")# Adds the cookie into current browser contextdriver.add_cookie({"name":"key","value":"value"})
usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;namespaceAddCookie{classAddCookie{publicstaticvoidMain(string[]args){IWebDriverdriver=newChromeDriver();try{// Navigate to Urldriver.Navigate().GoToUrl("https://example.com");// Adds the cookie into current browser contextdriver.Manage().Cookies.AddCookie(newCookie("key","value"));}finally{driver.Quit();}}}}
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegindriver.get'https://www.example.com'# Adds the cookie into current browser contextdriver.manage.add_cookie(name:"key",value:"value")ensuredriver.quitend
it('Create a cookie',asyncfunction(){awaitdriver.get('https://www.example.com');// set a cookie on the current domain
awaitdriver.manage().addCookie({name:'key',value:'value'});
importorg.openqa.selenium.Cookieimportorg.openqa.selenium.chrome.ChromeDriverfunmain(){valdriver=ChromeDriver()try{driver.get("https://example.com")// Adds the cookie into current browser context
driver.manage().addCookie(Cookie("key","value"))}finally{driver.quit()}}
Get Named Cookie
It returns the serialized cookie data matching with the cookie name among all associated cookies.
importorg.openqa.selenium.*;importorg.openqa.selenium.chrome.ChromeDriver;publicclassgetCookieNamed{publicstaticvoidmain(String[]args){WebDriverdriver=newChromeDriver();try{driver.get("http://www.example.com");driver.manage().addCookie(newCookie("foo","bar"));// Get cookie details with named cookie 'foo'
Cookiecookie1=driver.manage().getCookieNamed("foo");System.out.println(cookie1);}finally{driver.quit();}}}
fromseleniumimportwebdriverdriver=webdriver.Chrome()# Navigate to urldriver.get("http://www.example.com")# Adds the cookie into current browser contextdriver.add_cookie({"name":"foo","value":"bar"})# Get cookie details with named cookie 'foo'print(driver.get_cookie("foo"))
usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;namespaceGetCookieNamed{classGetCookieNamed{publicstaticvoidMain(string[]args){IWebDriverdriver=newChromeDriver();try{// Navigate to Urldriver.Navigate().GoToUrl("https://example.com");driver.Manage().Cookies.AddCookie(newCookie("foo","bar"));// Get cookie details with named cookie 'foo'varcookie=driver.Manage().Cookies.GetCookieNamed("foo");System.Console.WriteLine(cookie);}finally{driver.Quit();}}}}
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegindriver.get'https://www.example.com'driver.manage.add_cookie(name:"foo",value:"bar")# Get cookie details with named cookie 'foo'putsdriver.manage.cookie_named('foo')ensuredriver.quitend
it('Read cookie',asyncfunction(){awaitdriver.get('https://www.example.com');// set a cookie on the current domain
awaitdriver.manage().addCookie({name:'foo',value:'bar'});// Get cookie details with named cookie 'foo'
awaitdriver.manage().getCookie('foo').then(function(cookie){console.log('cookie details => ',cookie);});
importorg.openqa.selenium.Cookieimportorg.openqa.selenium.chrome.ChromeDriverfunmain(){valdriver=ChromeDriver()try{driver.get("https://example.com")driver.manage().addCookie(Cookie("foo","bar"))// Get cookie details with named cookie 'foo'
valcookie=driver.manage().getCookieNamed("foo")println(cookie)}finally{driver.quit()}}
Get All Cookies
It returns a ‘successful serialized cookie data’ for current browsing context.
If browser is no longer available it returns error.
importorg.openqa.selenium.*;importorg.openqa.selenium.chrome.ChromeDriver;importjava.util.Set;publicclassgetAllCookies{publicstaticvoidmain(String[]args){WebDriverdriver=newChromeDriver();try{driver.get("http://www.example.com");// Add few cookies
driver.manage().addCookie(newCookie("test1","cookie1"));driver.manage().addCookie(newCookie("test2","cookie2"));// Get All available cookies
Set<Cookie>cookies=driver.manage().getCookies();System.out.println(cookies);}finally{driver.quit();}}}
fromseleniumimportwebdriverdriver=webdriver.Chrome()# Navigate to urldriver.get("http://www.example.com")driver.add_cookie({"name":"test1","value":"cookie1"})driver.add_cookie({"name":"test2","value":"cookie2"})# Get all available cookiesprint(driver.get_cookies())
usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;namespaceGetAllCookies{classGetAllCookies{publicstaticvoidMain(string[]args){IWebDriverdriver=newChromeDriver();try{// Navigate to Urldriver.Navigate().GoToUrl("https://example.com");driver.Manage().Cookies.AddCookie(newCookie("test1","cookie1"));driver.Manage().Cookies.AddCookie(newCookie("test2","cookie2"));// Get All available cookiesvarcookies=driver.Manage().Cookies.AllCookies;}finally{driver.Quit();}}}}
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegindriver.get'https://www.example.com'driver.manage.add_cookie(name:"test1",value:"cookie1")driver.manage.add_cookie(name:"test2",value:"cookie2")# Get all available cookiesputsdriver.manage.all_cookiesensuredriver.quitend
it('Read all cookies',asyncfunction(){awaitdriver.get('https://www.example.com');// Add few cookies
awaitdriver.manage().addCookie({name:'test1',value:'cookie1'});awaitdriver.manage().addCookie({name:'test2',value:'cookie2'});// Get all Available cookies
awaitdriver.manage().getCookies().then(function(cookies){console.log('cookie details => ',cookies);});
importorg.openqa.selenium.Cookieimportorg.openqa.selenium.chrome.ChromeDriverfunmain(){valdriver=ChromeDriver()try{driver.get("https://example.com")driver.manage().addCookie(Cookie("test1","cookie1"))driver.manage().addCookie(Cookie("test2","cookie2"))// Get All available cookies
valcookies=driver.manage().cookiesprintln(cookies)}finally{driver.quit()}}
Delete Cookie
It deletes the cookie data matching with the provided cookie name.
importorg.openqa.selenium.*;importorg.openqa.selenium.chrome.ChromeDriver;publicclassdeleteCookie{publicstaticvoidmain(String[]args){WebDriverdriver=newChromeDriver();try{driver.get("http://www.example.com");driver.manage().addCookie(newCookie("test1","cookie1"));Cookiecookie1=newCookie("test2","cookie2");driver.manage().addCookie(cookie1);// delete a cookie with name 'test1'
driver.manage().deleteCookieNamed("test1");/*
Selenium Java bindings also provides a way to delete
cookie by passing cookie object of current browsing context
*/driver.manage().deleteCookie(cookie1);}finally{driver.quit();}}}
fromseleniumimportwebdriverdriver=webdriver.Chrome()# Navigate to urldriver.get("http://www.example.com")driver.add_cookie({"name":"test1","value":"cookie1"})driver.add_cookie({"name":"test2","value":"cookie2"})# Delete a cookie with name 'test1'driver.delete_cookie("test1")
usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;namespaceDeleteCookie{classDeleteCookie{publicstaticvoidMain(string[]args){IWebDriverdriver=newChromeDriver();try{// Navigate to Urldriver.Navigate().GoToUrl("https://example.com");driver.Manage().Cookies.AddCookie(newCookie("test1","cookie1"));varcookie=newCookie("test2","cookie2");driver.Manage().Cookies.AddCookie(cookie);// delete a cookie with name 'test1' driver.Manage().Cookies.DeleteCookieNamed("test1");// Selenium .net bindings also provides a way to delete// cookie by passing cookie object of current browsing contextdriver.Manage().Cookies.DeleteCookie(cookie);}finally{driver.Quit();}}}}
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegindriver.get'https://www.example.com'driver.manage.add_cookie(name:"test1",value:"cookie1")driver.manage.add_cookie(name:"test2",value:"cookie2")# delete a cookie with name 'test1'driver.manage.delete_cookie('test1')ensuredriver.quitend
it('Delete a cookie',asyncfunction(){awaitdriver.get('https://www.example.com');// Add few cookies
awaitdriver.manage().addCookie({name:'test1',value:'cookie1'});awaitdriver.manage().addCookie({name:'test2',value:'cookie2'});// Delete a cookie with name 'test1'
awaitdriver.manage().deleteCookie('test1');// Get all Available cookies
awaitdriver.manage().getCookies().then(function(cookies){console.log('cookie details => ',cookies);});
importorg.openqa.selenium.Cookieimportorg.openqa.selenium.chrome.ChromeDriverfunmain(){valdriver=ChromeDriver()try{driver.get("https://example.com")driver.manage().addCookie(Cookie("test1","cookie1"))valcookie1=Cookie("test2","cookie2")driver.manage().addCookie(cookie1)// delete a cookie with name 'test1'
driver.manage().deleteCookieNamed("test1")// delete cookie by passing cookie object of current browsing context.
driver.manage().deleteCookie(cookie1)}finally{driver.quit()}}
Delete All Cookies
It deletes all the cookies of the current browsing context.
importorg.openqa.selenium.*;importorg.openqa.selenium.chrome.ChromeDriver;publicclassdeleteAllCookies{publicstaticvoidmain(String[]args){WebDriverdriver=newChromeDriver();try{driver.get("http://www.example.com");driver.manage().addCookie(newCookie("test1","cookie1"));driver.manage().addCookie(newCookie("test2","cookie2"));// deletes all cookies
driver.manage().deleteAllCookies();}finally{driver.quit();}}}
fromseleniumimportwebdriverdriver=webdriver.Chrome()# Navigate to urldriver.get("http://www.example.com")driver.add_cookie({"name":"test1","value":"cookie1"})driver.add_cookie({"name":"test2","value":"cookie2"})# Deletes all cookiesdriver.delete_all_cookies()
usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;namespaceDeleteAllCookies{classDeleteAllCookies{publicstaticvoidMain(string[]args){IWebDriverdriver=newChromeDriver();try{// Navigate to Urldriver.Navigate().GoToUrl("https://example.com");driver.Manage().Cookies.AddCookie(newCookie("test1","cookie1"));driver.Manage().Cookies.AddCookie(newCookie("test2","cookie2"));// deletes all cookiesdriver.Manage().Cookies.DeleteAllCookies();}finally{driver.Quit();}}}}
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegindriver.get'https://www.example.com'driver.manage.add_cookie(name:"test1",value:"cookie1")driver.manage.add_cookie(name:"test2",value:"cookie2")# deletes all cookiesdriver.manage.delete_all_cookiesensuredriver.quitend
it('Delete all cookies',asyncfunction(){awaitdriver.get('https://www.example.com');// Add few cookies
awaitdriver.manage().addCookie({name:'test1',value:'cookie1'});awaitdriver.manage().addCookie({name:'test2',value:'cookie2'});// Delete all cookies
awaitdriver.manage().deleteAllCookies();
importorg.openqa.selenium.Cookieimportorg.openqa.selenium.chrome.ChromeDriverfunmain(){valdriver=ChromeDriver()try{driver.get("https://example.com")driver.manage().addCookie(Cookie("test1","cookie1"))driver.manage().addCookie(Cookie("test2","cookie2"))// deletes all cookies
driver.manage().deleteAllCookies()}finally{driver.quit()}}
Same-Site Cookie Attribute
It allows a user to instruct browsers to control whether cookies
are sent along with the request initiated by third party sites.
It is introduced to prevent CSRF (Cross-Site Request Forgery) attacks.
Same-Site cookie attribute accepts two parameters as instructions
Strict:
When the sameSite attribute is set as Strict,
the cookie will not be sent along with
requests initiated by third party websites.
Lax:
When you set a cookie sameSite attribute to Lax,
the cookie will be sent along with the GET
request initiated by third party website.
Note: As of now this feature is landed in chrome(80+version),
Firefox(79+version) and works with Selenium 4 and later versions.
fromseleniumimportwebdriverdriver=webdriver.Chrome()driver.get("http://www.example.com")# Adds the cookie into current browser context with sameSite 'Strict' (or) 'Lax'driver.add_cookie({"name":"foo","value":"value",'sameSite':'Strict'})driver.add_cookie({"name":"foo1","value":"value",'sameSite':'Lax'})cookie1=driver.get_cookie('foo')cookie2=driver.get_cookie('foo1')print(cookie1)print(cookie2)
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegindriver.get'https://www.example.com'# Adds the cookie into current browser context with sameSite 'Strict' (or) 'Lax'driver.manage.add_cookie(name:"foo",value:"bar",same_site:"Strict")driver.manage.add_cookie(name:"foo1",value:"bar",same_site:"Lax")putsdriver.manage.cookie_named('foo')putsdriver.manage.cookie_named('foo1')ensuredriver.quitend
it('Create cookies with sameSite',asyncfunction(){awaitdriver.get('https://www.example.com');// set a cookie on the current domain with sameSite 'Strict' (or) 'Lax'
awaitdriver.manage().addCookie({name:'key',value:'value',sameSite:'Strict'});awaitdriver.manage().addCookie({name:'key',value:'value',sameSite:'Lax'});
Frames are a now deprecated means of building a site layout from
multiple documents on the same domain. You are unlikely to work with
them unless you are working with an pre HTML5 webapp. Iframes allow
the insertion of a document from an entirely different domain, and are
still commonly used.
If you need to work with frames or iframes, WebDriver allows you to
work with them in the same way. Consider a button within an iframe.
If we inspect the element using the browser development tools, we might
see the following:
# This won't workdriver.find_element(:tag_name,'button').click
// This won't work
awaitdriver.findElement(By.css('button')).click();
//This won't work
driver.findElement(By.tagName("button")).click()
However, if there are no buttons outside of the iframe, you might
instead get a no such element error. This happens because Selenium is
only aware of the elements in the top level document. To interact with
the button, we will need to first switch to the frame, in a similar way
to how we switch windows. WebDriver offers three ways of switching to
a frame.
Using a WebElement
Switching using a WebElement is the most flexible option. You can
find the frame using your preferred selector and switch to it.
//Store the web element
WebElementiframe=driver.findElement(By.cssSelector("#modal>iframe"));//Switch to the frame
driver.switchTo().frame(iframe);//Now we can click the button
driver.findElement(By.tagName("button")).click();
# Store iframe web elementiframe=driver.find_element(By.CSS_SELECTOR,"#modal > iframe")# switch to selected iframedriver.switch_to.frame(iframe)# Now click on buttondriver.find_element(By.TAG_NAME,'button').click()
//Store the web elementIWebElementiframe=driver.FindElement(By.CssSelector("#modal>iframe"));//Switch to the framedriver.SwitchTo().Frame(iframe);//Now we can click the buttondriver.FindElement(By.TagName("button")).Click();
# Store iframe web elementiframe=driver.find_element(:css,'#modal > iframe')# Switch to the framedriver.switch_to.frameiframe# Now, Click on the buttondriver.find_element(:tag_name,'button').click
// Store the web element
constiframe=driver.findElement(By.css('#modal > iframe'));// Switch to the frame
awaitdriver.switchTo().frame(iframe);// Now we can click the button
awaitdriver.findElement(By.css('button')).click();
//Store the web element
valiframe=driver.findElement(By.cssSelector("#modal>iframe"))//Switch to the frame
driver.switchTo().frame(iframe)//Now we can click the button
driver.findElement(By.tagName("button")).click()
Using a name or ID
If your frame or iframe has an id or name attribute, this can be used
instead. If the name or ID is not unique on the page, then the first
one found will be switched to.
//Using the ID
driver.switchTo().frame("buttonframe");//Or using the name instead
driver.switchTo().frame("myframe");//Now we can click the button
driver.findElement(By.tagName("button")).click();
# Switch frame by iddriver.switch_to.frame('buttonframe')# Now, Click on the buttondriver.find_element(By.TAG_NAME,'button').click()
//Using the IDdriver.SwitchTo().Frame("buttonframe");//Or using the name insteaddriver.SwitchTo().Frame("myframe");//Now we can click the buttondriver.FindElement(By.TagName("button")).Click();
# Switch by IDdriver.switch_to.frame'buttonframe'# Now, Click on the buttondriver.find_element(:tag_name,'button').click
// Using the ID
awaitdriver.switchTo().frame('buttonframe');// Or using the name instead
awaitdriver.switchTo().frame('myframe');// Now we can click the button
awaitdriver.findElement(By.css('button')).click();
//Using the ID
driver.switchTo().frame("buttonframe")//Or using the name instead
driver.switchTo().frame("myframe")//Now we can click the button
driver.findElement(By.tagName("button")).click()
Using an index
It is also possible to use the index of the frame, such as can be
queried using window.frames in JavaScript.
// Switches to the second frame
driver.switchTo().frame(1);
# Switch to the second framedriver.switch_to.frame(1)
// Switches to the second framedriver.SwitchTo().Frame(1);
# switching to second iframe based on indexiframe=driver.find_elements(By.TAG_NAME,'iframe')[1]# switch to selected iframedriver.switch_to.frame(iframe)
// Switches to the second frame
awaitdriver.switchTo().frame(1);
// Switches to the second frame
driver.switchTo().frame(1)
Leaving a frame
To leave an iframe or frameset, switch back to the default content
like so:
// Return to the top level
driver.switchTo().defaultContent();
# switch back to default contentdriver.switch_to.default_content()
// Return to the top leveldriver.SwitchTo().DefaultContent();
# Return to the top leveldriver.switch_to.default_content
// Return to the top level
awaitdriver.switchTo().defaultContent();
// Return to the top level
driver.switchTo().defaultContent()
6.5 - Working with windows and tabs
Windows and tabs
Get window handle
WebDriver does not make the distinction between windows and tabs. If
your site opens a new tab or window, Selenium will let you work with it
using a window handle. Each window has a unique identifier which remains
persistent in a single session. You can get the window handle of the
current window by using:
driver.getWindowHandle();
driver.current_window_handle
driver.CurrentWindowHandle;
driver.window_handle
awaitdriver.getWindowHandle();
driver.windowHandle
Switching windows or tabs
Clicking a link which opens in a
new window
will focus the new window or tab on screen, but WebDriver will not know which
window the Operating System considers active. To work with the new window
you will need to switch to it. If you have only two tabs or windows open,
and you know which window you start with, by the process of elimination
you can loop over both windows or tabs that WebDriver can see, and switch
to the one which is not the original.
However, Selenium 4 provides a new api NewWindow
which creates a new tab (or) new window and automatically switches to it.
//Store the ID of the original window
StringoriginalWindow=driver.getWindowHandle();//Check we don't have other windows open already
assertdriver.getWindowHandles().size()==1;//Click the link which opens in a new window
driver.findElement(By.linkText("new window")).click();//Wait for the new window or tab
wait.until(numberOfWindowsToBe(2));//Loop through until we find a new window handle
for(StringwindowHandle:driver.getWindowHandles()){if(!originalWindow.contentEquals(windowHandle)){driver.switchTo().window(windowHandle);break;}}//Wait for the new tab to finish loading content
wait.until(titleIs("Selenium documentation"));
fromseleniumimportwebdriverfromselenium.webdriver.support.uiimportWebDriverWaitfromselenium.webdriver.supportimportexpected_conditionsasECwithwebdriver.Firefox()asdriver:# Open URLdriver.get("https://seleniumhq.github.io")# Setup wait for laterwait=WebDriverWait(driver,10)# Store the ID of the original windoworiginal_window=driver.current_window_handle# Check we don't have other windows open alreadyassertlen(driver.window_handles)==1# Click the link which opens in a new windowdriver.find_element(By.LINK_TEXT,"new window").click()# Wait for the new window or tabwait.until(EC.number_of_windows_to_be(2))# Loop through until we find a new window handleforwindow_handleindriver.window_handles:ifwindow_handle!=original_window:driver.switch_to.window(window_handle)break# Wait for the new tab to finish loading contentwait.until(EC.title_is("SeleniumHQ Browser Automation"))
//Store the ID of the original windowstringoriginalWindow=driver.CurrentWindowHandle;//Check we don't have other windows open alreadyAssert.AreEqual(driver.WindowHandles.Count,1);//Click the link which opens in a new windowdriver.FindElement(By.LinkText("new window")).Click();//Wait for the new window or tabwait.Until(wd=>wd.WindowHandles.Count==2);//Loop through until we find a new window handleforeach(stringwindowindriver.WindowHandles){if(originalWindow!=window){driver.SwitchTo().Window(window);break;}}//Wait for the new tab to finish loading contentwait.Until(wd=>wd.Title=="Selenium documentation");
# Store the ID of the original windoworiginal_window=driver.window_handle# Check we don't have other windows open alreadyassert(driver.window_handles.length==1,'Expected one window')# Click the link which opens in a new windowdriver.find_element(link:'new window').click# Wait for the new window or tabwait.until{driver.window_handles.length==2}#Loop through until we find a new window handledriver.window_handles.eachdo|handle|ifhandle!=original_windowdriver.switch_to.windowhandlebreakendend#Wait for the new tab to finish loading contentwait.until{driver.title=='Selenium documentation'}
//Store the ID of the original window
constoriginalWindow=awaitdriver.getWindowHandle();//Check we don't have other windows open already
assert((awaitdriver.getAllWindowHandles()).length===1);//Click the link which opens in a new window
awaitdriver.findElement(By.linkText('new window')).click();//Wait for the new window or tab
awaitdriver.wait(async()=>(awaitdriver.getAllWindowHandles()).length===2,10000);//Loop through until we find a new window handle
constwindows=awaitdriver.getAllWindowHandles();windows.forEach(asynchandle=>{if(handle!==originalWindow){awaitdriver.switchTo().window(handle);}});//Wait for the new tab to finish loading content
awaitdriver.wait(until.titleIs('Selenium documentation'),10000);
//Store the ID of the original window
valoriginalWindow=driver.getWindowHandle()//Check we don't have other windows open already
assert(driver.getWindowHandles().size()===1)//Click the link which opens in a new window
driver.findElement(By.linkText("new window")).click()//Wait for the new window or tab
wait.until(numberOfWindowsToBe(2))//Loop through until we find a new window handle
for(windowHandleindriver.getWindowHandles()){if(!originalWindow.contentEquals(windowHandle)){driver.switchTo().window(windowHandle)break}}//Wait for the new tab to finish loading content
wait.until(titleIs("Selenium documentation"))
Create new window (or) new tab and switch
Creates a new window (or) tab and will focus the new window or tab on screen.
You don’t need to switch to work with the new window (or) tab. If you have more than two windows
(or) tabs opened other than the new window, you can loop over both windows or tabs that WebDriver can see,
and switch to the one which is not the original.
Note: This feature works with Selenium 4 and later versions.
// Opens a new tab and switches to new tab
driver.switchTo().newWindow(WindowType.TAB);// Opens a new window and switches to new window
driver.switchTo().newWindow(WindowType.WINDOW);
# Opens a new tab and switches to new tabdriver.switch_to.new_window('tab')# Opens a new window and switches to new windowdriver.switch_to.new_window('window')
// Opens a new tab and switches to new tabdriver.SwitchTo().NewWindow(WindowType.Tab)// Opens a new window and switches to new windowdriver.SwitchTo().NewWindow(WindowType.Window)
# Note: The new_window in ruby only opens a new tab (or) Window and will not switch automatically# The user has to switch to new tab (or) new window# Opens a new tab and switches to new tabdriver.manage.new_window(:tab)# Opens a new window and switches to new windowdriver.manage.new_window(:window)
// Opens a new tab and switches to new tab
awaitdriver.switchTo().newWindow('tab');// Opens a new window and switches to new window
awaitdriver.switchTo().newWindow('window');
// Opens a new tab and switches to new tab
driver.switchTo().newWindow(WindowType.TAB)// Opens a new window and switches to new window
driver.switchTo().newWindow(WindowType.WINDOW)
Closing a window or tab
When you are finished with a window or tab and it is not the
last window or tab open in your browser, you should close it and switch
back to the window you were using previously. Assuming you followed the
code sample in the previous section you will have the previous window
handle stored in a variable. Put this together and you will get:
//Close the tab or window
driver.close();//Switch back to the old tab or window
driver.switchTo().window(originalWindow);
#Close the tab or windowdriver.close()#Switch back to the old tab or windowdriver.switch_to.window(original_window)
//Close the tab or windowdriver.Close();//Switch back to the old tab or windowdriver.SwitchTo().Window(originalWindow);
#Close the tab or windowdriver.close#Switch back to the old tab or windowdriver.switch_to.windoworiginal_window
//Close the tab or window
awaitdriver.close();//Switch back to the old tab or window
awaitdriver.switchTo().window(originalWindow);
//Close the tab or window
driver.close()//Switch back to the old tab or window
driver.switchTo().window(originalWindow)
Forgetting to switch back to another window handle after closing a
window will leave WebDriver executing on the now closed page, and will
trigger a No Such Window Exception. You must switch
back to a valid window handle in order to continue execution.
Quitting the browser at the end of a session
When you are finished with the browser session you should call quit,
instead of close:
driver.quit();
driver.quit()
driver.Quit();
driver.quit
awaitdriver.quit();
driver.quit()
Quit will:
Close all the windows and tabs associated with that WebDriver
session
Close the browser process
Close the background driver process
Notify Selenium Grid that the browser is no longer in use so it can
be used by another session (if you are using Selenium Grid)
Failure to call quit will leave extra background processes and ports
running on your machine which could cause you problems later.
Some test frameworks offer methods and annotations which you can hook
into to tear down at the end of a test.
/**
* Example using JUnit
* https://junit.org/junit5/docs/current/api/org/junit/jupiter/api/AfterAll.html
*/@AfterAllpublicstaticvoidtearDown(){driver.quit();}
/*
Example using Visual Studio's UnitTesting
https://msdn.microsoft.com/en-us/library/microsoft.visualstudio.testtools.unittesting.aspx
*/[TestCleanup]publicvoidTearDown(){driver.Quit();}
/**
* Example using Mocha
* https://mochajs.org/#hooks
*/after('Tear down',asyncfunction(){awaitdriver.quit();});
/**
* Example using JUnit
* https://junit.org/junit5/docs/current/api/org/junit/jupiter/api/AfterAll.html
*/@AfterAllfuntearDown(){driver.quit()}
If not running WebDriver in a test context, you may consider using
try / finally which is offered by most languages so that an exception
will still clean up the WebDriver session.
Python’s WebDriver now supports the python context manager,
which when using the with keyword can automatically quit the driver at
the end of execution.
withwebdriver.Firefox()asdriver:# WebDriver code here...# WebDriver will automatically quit after indentation
Window management
Screen resolution can impact how your web application renders, so
WebDriver provides mechanisms for moving and resizing the browser
window.
Get window size
Fetches the size of the browser window in pixels.
//Access each dimension individually
intwidth=driver.manage().window().getSize().getWidth();intheight=driver.manage().window().getSize().getHeight();//Or store the dimensions and query them later
Dimensionsize=driver.manage().window().getSize();intwidth1=size.getWidth();intheight1=size.getHeight();
# Access each dimension individuallywidth=driver.get_window_size().get("width")height=driver.get_window_size().get("height")# Or store the dimensions and query them latersize=driver.get_window_size()width1=size.get("width")height1=size.get("height")
//Access each dimension individuallyintwidth=driver.Manage().Window.Size.Width;intheight=driver.Manage().Window.Size.Height;//Or store the dimensions and query them laterSystem.Drawing.Sizesize=driver.Manage().Window.Size;intwidth1=size.Width;intheight1=size.Height;
# Access each dimension individuallywidth=driver.manage.window.size.widthheight=driver.manage.window.size.height# Or store the dimensions and query them latersize=driver.manage.window.sizewidth1=size.widthheight1=size.height
// Access each dimension individually
const{width,height}=awaitdriver.manage().window().getRect();// Or store the dimensions and query them later
constrect=awaitdriver.manage().window().getRect();constwidth1=rect.width;constheight1=rect.height;
//Access each dimension individually
valwidth=driver.manage().window().size.widthvalheight=driver.manage().window().size.height//Or store the dimensions and query them later
valsize=driver.manage().window().sizevalwidth1=size.widthvalheight1=size.height
Fetches the coordinates of the top left coordinate of the browser window.
// Access each dimension individually
intx=driver.manage().window().getPosition().getX();inty=driver.manage().window().getPosition().getY();// Or store the dimensions and query them later
Pointposition=driver.manage().window().getPosition();intx1=position.getX();inty1=position.getY();
# Access each dimension individuallyx=driver.get_window_position().get('x')y=driver.get_window_position().get('y')# Or store the dimensions and query them laterposition=driver.get_window_position()x1=position.get('x')y1=position.get('y')
//Access each dimension individuallyintx=driver.Manage().Window.Position.X;inty=driver.Manage().Window.Position.Y;//Or store the dimensions and query them laterPointposition=driver.Manage().Window.Position;intx1=position.X;inty1=position.Y;
#Access each dimension individuallyx=driver.manage.window.position.xy=driver.manage.window.position.y# Or store the dimensions and query them laterrect=driver.manage.window.rectx1=rect.xy1=rect.y
// Access each dimension individually
const{x,y}=awaitdriver.manage().window().getRect();// Or store the dimensions and query them later
constrect=awaitdriver.manage().window().getRect();constx1=rect.x;consty1=rect.y;
// Access each dimension individually
valx=driver.manage().window().position.xvaly=driver.manage().window().position.y// Or store the dimensions and query them later
valposition=driver.manage().window().positionvalx1=position.xvaly1=position.y
Set window position
Moves the window to the chosen position.
// Move the window to the top left of the primary monitor
driver.manage().window().setPosition(newPoint(0,0));
# Move the window to the top left of the primary monitordriver.set_window_position(0,0)
// Move the window to the top left of the primary monitordriver.Manage().Window.Position=newPoint(0,0);
driver.manage.window.move_to(0,0)
// Move the window to the top left of the primary monitor
awaitdriver.manage().window().setRect({x:0,y:0});
// Move the window to the top left of the primary monitor
driver.manage().window().position=Point(0,0)
Maximize window
Enlarges the window. For most operating systems, the window will fill
the screen, without blocking the operating system’s own menus and
toolbars.
driver.manage().window().maximize();
driver.maximize_window()
driver.Manage().Window.Maximize();
driver.manage.window.maximize
awaitdriver.manage().window().maximize();
driver.manage().window().maximize()
Minimize window
Minimizes the window of current browsing context.
The exact behavior of this command is specific to
individual window managers.
Minimize Window typically hides the window in the system tray.
Note: This feature works with Selenium 4 and later versions.
driver.manage().window().minimize();
driver.minimize_window()
driver.Manage().Window.Minimize();
driver.manage.window.minimize
awaitdriver.manage().window().minimize();
driver.manage().window().minimize()
Fullscreen window
Fills the entire screen, similar to pressing F11 in most browsers.
driver.manage().window().fullscreen();
driver.fullscreen_window()
driver.Manage().Window.FullScreen();
driver.manage.window.full_screen
awaitdriver.manage().window().fullscreen();
driver.manage().window().fullscreen()
TakeScreenshot
Used to capture screenshot for current browsing context.
The WebDriver endpoint screenshot
returns screenshot which is encoded in Base64 format.
fromseleniumimportwebdriverdriver=webdriver.Chrome()driver.get("http://www.example.com")# Returns and base64 encoded string into imagedriver.save_screenshot('./image.png')driver.quit()
usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;usingOpenQA.Selenium.Support.UI;vardriver=newChromeDriver();driver.Navigate().GoToUrl("http://www.example.com");Screenshotscreenshot=(driverasITakesScreenshot).GetScreenshot();screenshot.SaveAsFile("screenshot.png",ScreenshotImageFormat.Png);// Format values are Bmp, Gif, Jpeg, Png, Tiff
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegindriver.get'https://example.com/'# Takes and Stores the screenshot in specified pathdriver.save_screenshot('./image.png')end
Used to capture screenshot of an element for current browsing context.
The WebDriver endpoint screenshot
returns screenshot which is encoded in Base64 format.
fromseleniumimportwebdriverfromselenium.webdriver.common.byimportBydriver=webdriver.Chrome()driver.get("http://www.example.com")ele=driver.find_element(By.CSS_SELECTOR,'h1')# Returns and base64 encoded string into imageele.screenshot('./image.png')driver.quit()
usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;usingOpenQA.Selenium.Support.UI;// Webdrivervardriver=newChromeDriver();driver.Navigate().GoToUrl("http://www.example.com");// Fetch element using FindElementvarwebElement=driver.FindElement(By.CssSelector("h1"));// Screenshot for the elementvarelementScreenshot=(webElementasITakesScreenshot).GetScreenshot();elementScreenshot.SaveAsFile("screenshot_of_element.png");
# Works with Selenium4-alpha7 Ruby bindings and aboverequire'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegindriver.get'https://example.com/'ele=driver.find_element(:css,'h1')# Takes and Stores the element screenshot in specified pathele.save_screenshot('./image.jpg')end
const{Builder,By}=require('selenium-webdriver');letfs=require('fs');(asyncfunctionexample(){letdriver=awaitnewBuilder().forBrowser('chrome').build();awaitdriver.get('https://www.example.com');letele=awaitdriver.findElement(By.css("h1"));// Captures the element screenshot
letencodedString=awaitele.takeScreenshot(true);awaitfs.writeFileSync('./image.png',encodedString,'base64');awaitdriver.quit();}())
Executes JavaScript code snippet in the
current context of a selected frame or window.
//Creating the JavascriptExecutor interface object by Type casting
JavascriptExecutorjs=(JavascriptExecutor)driver;//Button Element
WebElementbutton=driver.findElement(By.name("btnLogin"));//Executing JavaScript to click on element
js.executeScript("arguments[0].click();",button);//Get return value from script
Stringtext=(String)js.executeScript("return arguments[0].innerText",button);//Executing JavaScript directly
js.executeScript("console.log('hello world')");
# Stores the header elementheader=driver.find_element(By.CSS_SELECTOR,"h1")# Executing JavaScript to capture innerText of header elementdriver.execute_script('return arguments[0].innerText',header)
//creating Chromedriver instanceIWebDriverdriver=newChromeDriver();//Creating the JavascriptExecutor interface object by Type castingIJavaScriptExecutorjs=(IJavaScriptExecutor)driver;//Button ElementIWebElementbutton=driver.FindElement(By.Name("btnLogin"));//Executing JavaScript to click on elementjs.ExecuteScript("arguments[0].click();",button);//Get return value from scriptStringtext=(String)js.ExecuteScript("return arguments[0].innerText",button);//Executing JavaScript directlyjs.ExecuteScript("console.log('hello world')");
# Stores the header elementheader=driver.find_element(css:'h1')# Get return value from scriptresult=driver.execute_script("return arguments[0].innerText",header)# Executing JavaScript directlydriver.execute_script("alert('hello world')")
// Stores the header element
letheader=awaitdriver.findElement(By.css('h1'));// Executing JavaScript to capture innerText of header element
lettext=awaitdriver.executeScript('return arguments[0].innerText',header);
// Stores the header element
valheader=driver.findElement(By.cssSelector("h1"))// Get return value from script
valresult=driver.executeScript("return arguments[0].innerText",header)// Executing JavaScript directly
driver.executeScript("alert('hello world')")
Print Page
Prints the current page within the browser.
Note: This requires Chromium Browsers to be in headless mode
Web applications can enable a public key-based authentication mechanism known as Web Authentication to authenticate users in a passwordless manner.
Web Authentication defines APIs that allows a user to create a public-key credential and register it with an authenticator.
An authenticator can be a hardware device or a software entity that stores user’s public-key credentials and retrieves them on request.
As the name suggests, Virtual Authenticator emulates such authenticators for testing.
Virtual Authenticator Options
A Virtual Authenticatior has a set of properties.
These properties are mapped as VirtualAuthenticatorOptions in the Selenium bindings.
A low-level interface for providing virtualized device input actions to the web browser.
In addition to the high-level element interactions,
the Actions API provides granular control over
exactly what designated input devices can do. Selenium provides an interface for 3 kinds of input sources:
a key input for keyboard devices, a pointer input for a mouse, pen or touch devices,
and wheel inputs for scroll wheel devices (introduced in Selenium 4.2).
Selenium allows you to construct individual action commands assigned to specific
inputs and chain them together and call the associated perform method to execute them all at once.
Action Builder
In the move from the legacy JSON Wire Protocol to the new W3C WebDriver Protocol,
the low level building blocks of actions became especially detailed. It is extremely
powerful, but each input device has a number of ways to use it and if you need to
manage more than one device, you are responsible for ensuring proper synchronization between them.
Thankfully, you likely do not need to learn how to use the low level commands directly, since
almost everything you might want to do has been given a convenience method that combines the
lower level commands for you. These are all documented in
keyboard, mouse, pen, and wheel pages.
Pause
Pointer movements and Wheel scrolling allow the user to set a duration for the action, but sometimes you just need
to wait a beat between actions for things to work correctly.
An important thing to note is that the driver remembers the state of all the input
items throughout a session. Even if you create a new instance of an actions class, the depressed keys and
the location of the pointer will be in whatever state a previously performed action left them.
There is a special method to release all currently depressed keys and pointer buttons.
This method is implemented differently in each of the languages because
it does not get executed with the perform method.
A representation of any key input device for interacting with a web page.
There are only 2 actions that can be accomplished with a keyboard:
pressing down on a key, and releasing a pressed key.
In addition to supporting ASCII characters, each keyboard key has
a representation that can be pressed or released in designated sequences.
Keys
In addition to the keys represented by regular unicode,
unicode values have been assigned to other keyboard keys for use with Selenium.
Each language has its own way to reference these keys; the full list can be found
here.
This is a convenience method in the Actions API that combines keyDown and keyUp commands in one action.
Executing this command differs slightly from using the element method, but
primarily this gets used when needing to type multiple characters in the middle of other actions.
Here’s an example of using all of the above methods to conduct a copy / paste action.
Note that the key to use for this operation will be different depending on if it is a Mac OS or not.
This code will end up with the text: SeleniumSelenium!
A representation of any pointer device for interacting with a web page.
There are only 3 actions that can be accomplished with a mouse:
pressing down on a button, releasing a pressed button, and moving the mouse.
Selenium provides convenience methods that combine these actions in the most common ways.
Click and hold
This method combines moving the mouse to the center of an element with pressing the left mouse button.
This is useful for focusing a specific element:
There are a total of 5 defined buttons for a Mouse:
0 — Left Button (the default)
1 — Middle Button (currently unsupported)
2 — Right Button
3 — X1 (Back) Button
4 — X2 (Forward) Button
Context Click
This method combines moving to the center of an element with pressing and releasing the right mouse button (button 2).
This is otherwise known as “right-clicking”:
This method moves the mouse to the in-view center point of the element.
This is otherwise known as “hovering.”
Note that the element must be in the viewport or else the command will error.
These methods first move the mouse to the designated origin and then
by the number of pixels in the provided offset.
Note that the position of the mouse must be in the viewport or else the command will error.
Offset from Element
This method moves the mouse to the in-view center point of the element,
then moves by the provided offset.
This method moves the mouse from its current position by the offset provided by the user.
If the mouse has not previously been moved, the position will be in the upper left
corner of the viewport.
Note that the pointer position does not change when the page is scrolled.
Note that the first argument X specifies to move right when positive, while the second argument
Y specifies to move down when positive. So moveByOffset(30, -10) moves right 30 and up 10 from
the current mouse position.
A Pen is a type of pointer input that has most of the same behavior as a mouse, but can
also have event properties unique to a stylus. Additionally, while a mouse
has 5 buttons, a pen has 3 equivalent button states:
0 — Touch Contact (the default; equivalent to a left click)
2 — Barrel Button (equivalent to a right click)
5 — Eraser Button (currently unsupported by drivers)
This is the most common scenario. Unlike traditional click and send keys methods,
the actions class does not automatically scroll the target element into view,
so this method will need to be used if elements are not already inside the viewport.
This method takes a web element as the sole argument.
Regardless of whether the element is above or below the current viewscreen,
the viewport will be scrolled so the bottom of the element is at the bottom of the screen.
This is the second most common scenario for scrolling. Pass in an delta x and a delta y value for how much to scroll
in the right and down directions. Negative values represent left and up, respectively.
This scenario is effectively a combination of the above two methods.
To execute this use the “Scroll From” method, which takes 3 arguments.
The first represents the origination point, which we designate as the element,
and the second two are the delta x and delta y values.
If the element is out of the viewport,
it will be scrolled to the bottom of the screen, then the page will be scrolled by the provided
delta x and delta y values.
This scenario is used when you need to scroll only a portion of the screen, and it is outside the viewport.
Or is inside the viewport and the portion of the screen that must be scrolled
is a known offset away from a specific element.
This uses the “Scroll From” method again, and in addition to specifying the element,
an offset is specified to indicate the origin point of the scroll. The offset is
calculated from the center of the provided element.
If the element is out of the viewport,
it first will be scrolled to the bottom of the screen, then the origin of the scroll will be determined
by adding the offset to the coordinates of the center of the element, and finally
the page will be scrolled by the provided delta x and delta y values.
Note that if the offset from the center of the element falls outside of the viewport,
it will result in an exception.
Scroll from a offset of origin (element) by given amount
The final scenario is used when you need to scroll only a portion of the screen,
and it is already inside the viewport.
This uses the “Scroll From” method again, but the viewport is designated instead
of an element. An offset is specified from the upper left corner of the
current viewport. After the origin point is determined,
the page will be scrolled by the provided delta x and delta y values.
Note that if the offset from the upper left corner of the viewport falls outside of the screen,
it will result in an exception.
Selenium is working with browser vendors to create the
WebDriver BiDirectional Protocol
as a means to provide a stable, cross-browser API that uses the bidirectional
functionality useful for both browser automation generally and testing specifically.
Before now, users seeking this functionality have had to rely on CDP (Chrome DevTools Protocol)
with all of its frustrations and limitations.
The traditional WebDriver model of strict request/response commands will be supplemented
with the ability to stream events from the user agent to the controlling software via WebSockets,
better matching the evented nature of the browser DOM.
As it is not a good idea to tie your tests to a specific version of any browser, the
Selenium project recommends using WebDriver BiDi wherever possible.
While the specification is in works, the browser vendors are parallely implementing
the WebDriver BiDirectional Protocol.
Refer web-platform-tests dashboard
to see how far along the browser vendors are.
Selenium is trying to keep up with the browser vendors and has started implementing W3C BiDi APIs.
The goal is to ensure APIs are W3C compliant and uniform among the different language bindings.
However, until the specification and corresponding Selenium implementation is complete there are many useful things that
CDP offers. Selenium offers some useful helper classes that use CDP.
8.1 - BiDirectional API (CDP implementation)
The following list of APIs will be growing as the Selenium
project works through supporting real world use cases. If there
is additional functionality you’d like to see, please raise a
feature request.
Register Basic Auth
Some applications make use of browser authentication to secure pages.
With Selenium, you can automate the input of basic auth credentials whenever they arise.
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegindriver.devtools.newdriver.register(username:'username',password:'password')driver.get'<your site url>'ensuredriver.quitend
Listen to the console.log events and register callbacks to process the event.
ChromeDriverdriver=newChromeDriver();DevToolsdevTools=driver.getDevTools();devTools.createSession();devTools.send(Log.enable());devTools.addListener(Log.entryAdded(),logEntry->{System.out.println("log: "+logEntry.getText());System.out.println("level: "+logEntry.getLevel());});driver.get("http://the-internet.herokuapp.com/broken_images");// Check the terminal output for the browser console messages.
driver.quit();
importtriofromseleniumimportwebdriverfromselenium.webdriver.common.logimportLogasyncdefprintConsoleLogs():chrome_options=webdriver.ChromeOptions()driver=webdriver.Chrome()driver.get("http://www.google.com")asyncwithdriver.bidi_connection()assession:log=Log(driver,session)fromselenium.webdriver.common.bidi.consoleimportConsoleasyncwithlog.add_listener(Console.ALL)asmessages:driver.execute_script("console.log('I love cheese')")print(messages["message"])driver.quit()trio.run(printConsoleLogs)
funkotlinConsoleLogExample(){valdriver=ChromeDriver()valdevTools=driver.devToolsdevTools.createSession()vallogConsole={c:ConsoleEvent->print("Console log message is: "+c.messages)}devTools.domains.events().addConsoleListener(logConsole)driver.get("https://www.google.com")valexecutor=driverasJavascriptExecutorexecutor.executeScript("console.log('Hello World')")valinput=driver.findElement(By.name("q"))input.sendKeys("Selenium 4")input.sendKeys(Keys.RETURN)driver.quit()}
Listen to JS Exceptions
Listen to the JS Exceptions
and register callbacks to process the exception details.
importorg.openqa.selenium.*;importorg.openqa.selenium.chrome.ChromeDriver;importorg.openqa.selenium.devtools.DevTools;publicvoidjsExceptionsExample(){ChromeDriverdriver=newChromeDriver();DevToolsdevTools=driver.getDevTools();devTools.createSession();List<JavascriptException>jsExceptionsList=newArrayList<>();Consumer<JavascriptException>addEntry=jsExceptionsList::add;devTools.getDomains().events().addJavascriptExceptionListener(addEntry);driver.get("<your site url>");WebElementlink2click=driver.findElement(By.linkText("<your link text>"));((JavascriptExecutor)driver).executeScript("arguments[0].setAttribute(arguments[1], arguments[2]);",link2click,"onclick","throw new Error('Hello, world!')");link2click.click();for(JavascriptExceptionjsException:jsExceptionsList){System.out.println("JS exception message: "+jsException.getMessage());System.out.println("JS exception system information: "+jsException.getSystemInformation());jsException.printStackTrace();}}
asyncdefcatchJSException():chrome_options=webdriver.ChromeOptions()driver=webdriver.Chrome()asyncwithdriver.bidi_connection()assession:driver.get("<your site url>")log=Log(driver,session)asyncwithlog.add_js_error_listener()asmessages:# Operation on the website that throws an JS errorprint(messages)driver.quit()
List<string>exceptionMessages=newList<string>();usingIJavaScriptEnginemonitor=newJavaScriptEngine(driver);monitor.JavaScriptExceptionThrown+=(sender,e)=>{exceptionMessages.Add(e.Message);};awaitmonitor.StartEventMonitoring();driver.Navigate.GoToUrl("<your site url>");IWebElementlink2click=driver.FindElement(By.LinkText("<your link text>"));((IJavaScriptExecutor)driver).ExecuteScript("arguments[0].setAttribute(arguments[1], arguments[2]);",link2click,"onclick","throw new Error('Hello, world!')");link2click.Click();foreach(stringmessageinexceptionMessages){Console.WriteLine("JS exception message: {0}",message);}
const{Builder,By}=require('selenium-webdriver');(async()=>{try{letdriver=newBuilder().forBrowser('chrome').build();constcdpConnection=awaitdriver.createCDPConnection('page')awaitdriver.onLogException(cdpConnection,function(event){console.log(event['exceptionDetails']);})awaitdriver.get('https://the-internet.herokuapp.com');constlink=awaitdriver.findElement(By.linkText('Checkboxes'));awaitdriver.executeScript("arguments[0].setAttribute(arguments[1], arguments[2]);",link,"onclick","throw new Error('Hello, world!')");awaitlink.click();awaitdriver.quit();}catch(e){console.log(e);}})()
funkotlinJsErrorListener(){valdriver=ChromeDriver()valdevTools=driver.devToolsdevTools.createSession()vallogJsError={j:JavascriptException->print("Javascript error: '"+j.localizedMessage+"'.")}devTools.domains.events().addJavascriptExceptionListener(logJsError)driver.get("https://www.google.com")vallink2click=driver.findElement(By.name("q"))(driverasJavascriptExecutor).executeScript("arguments[0].setAttribute(arguments[1], arguments[2]);",link2click,"onclick","throw new Error('Hello, world!')")link2click.click()driver.quit()}
Network Interception
If you want to capture network events coming into the browser and you want manipulate them you are able to do
it with the following examples.
While Selenium 4 provides direct access to the Chrome DevTools Protocol (CDP), it is
highly encouraged that you use the WebDriver Bidi APIs instead.
Many browsers provide “DevTools” – a set of tools that are integrated with the browser that
developers can use to debug web apps and explore the performance of their pages. Google Chrome’s
DevTools make use of a protocol called the Chrome DevTools Protocol (or “CDP” for short).
As the name suggests, this is not designed for testing, nor to have a stable API, so functionality
is highly dependent on the version of the browser.
WebDriver Bidi is the next generation of the W3C WebDriver protocol and aims to provide a stable API
implemented by all browsers, but it’s not yet complete. Until it is, Selenium provides access to
the CDP for those browsers that implement it (such as Google Chrome, or Microsoft Edge, and
Firefox), allowing you to enhance your tests in interesting ways. Some examples of what you can
do with it are given below.
Emulate Geo Location
Some applications have different features and functionalities across different
locations. Automating such applications is difficult because it is hard to emulate
the geo-locations in the browser using Selenium. But with the help of Devtools,
we can easily emulate them. Below code snippet demonstrates that.
fromseleniumimportwebdriverfromselenium.webdriver.chrome.serviceimportServicedefgeoLocationTest():driver=webdriver.Chrome()Map_coordinates=dict({"latitude":41.8781,"longitude":-87.6298,"accuracy":100})driver.execute_cdp_cmd("Emulation.setGeolocationOverride",Map_coordinates)driver.get("<your site url>")
usingSystem.Threading.Tasks;usingOpenQA.Selenium.Chrome;usingOpenQA.Selenium.DevTools;// Replace the version to match the Chrome versionusingOpenQA.Selenium.DevTools.V87.Emulation;namespacedotnet_test{classProgram{publicstaticvoidMain(string[]args){GeoLocation().GetAwaiter().GetResult();}publicstaticasyncTaskGeoLocation(){ChromeDriverdriver=newChromeDriver();DevToolsSessiondevToolsSession=driver.CreateDevToolsSession();vargeoLocationOverrideCommandSettings=newSetGeolocationOverrideCommandSettings();geoLocationOverrideCommandSettings.Latitude=51.507351;geoLocationOverrideCommandSettings.Longitude=-0.127758;geoLocationOverrideCommandSettings.Accuracy=1;awaitdevToolsSession.GetVersionSpecificDomains<OpenQA.Selenium.DevTools.V87.DevToolsSessionDomains>().Emulation.SetGeolocationOverride(geoLocationOverrideCommandSettings);driver.Url="<your site url>";}}}
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegin# Latitude and longitude of Tokyo, Japancoordinates={latitude:35.689487,longitude:139.691706,accuracy:100}driver.execute_cdp('Emulation.setGeolocationOverride',coordinates)driver.get'https://www.google.com/search?q=selenium'ensuredriver.quitend
const{By,Key,Browser}=require('selenium-webdriver');const{suite}=require('selenium-webdriver/testing');constassert=require("assert");suite(function(env){describe('Emulate geolocation',function(){letdriver;before(asyncfunction(){driver=awaitenv.builder().build();});after(()=>driver.quit());it('Emulate coordinates of Tokyo',asyncfunction(){constcdpConnection=awaitdriver.createCDPConnection('page');// Latitude and longitude of Tokyo, Japan
constcoordinates={latitude:35.689487,longitude:139.691706,accuracy:100,};awaitcdpConnection.execute("Emulation.setGeolocationOverride",coordinates);awaitdriver.get("https://kawasaki-india.com/dealer-locator/");});});},{browsers:[Browser.CHROME,Browser.FIREFOX]});
fromseleniumimportwebdriver#Replace the version to match the Chrome versionimportselenium.webdriver.common.devtools.v93asdevtoolsasyncdefgeoLocationTest():chrome_options=webdriver.ChromeOptions()driver=webdriver.Remote(command_executor='<grid-url>',options=chrome_options)asyncwithdriver.bidi_connection()assession:cdpSession=session.sessionawaitcdpSession.execute(devtools.emulation.set_geolocation_override(latitude=41.8781,longitude=-87.6298,accuracy=100))driver.get("https://my-location.org/")driver.quit()
usingSystem.Threading.Tasks;usingOpenQA.Selenium.Chrome;usingOpenQA.Selenium.DevTools;// Replace the version to match the Chrome versionusingOpenQA.Selenium.DevTools.V87.Emulation;namespacedotnet_test{classProgram{publicstaticvoidMain(string[]args){GeoLocation().GetAwaiter().GetResult();}publicstaticasyncTaskGeoLocation(){ChromeOptionschromeOptions=newChromeOptions();RemoteWebDriverdriver=newRemoteWebDriver(newUri("<grid-url>"),chromeOptions);DevToolsSessiondevToolsSession=driver.CreateDevToolsSession();vargeoLocationOverrideCommandSettings=newSetGeolocationOverrideCommandSettings();geoLocationOverrideCommandSettings.Latitude=51.507351;geoLocationOverrideCommandSettings.Longitude=-0.127758;geoLocationOverrideCommandSettings.Accuracy=1;awaitdevToolsSession.GetVersionSpecificDomains<OpenQA.Selenium.DevTools.V87.DevToolsSessionDomains>().Emulation.SetGeolocationOverride(geoLocationOverrideCommandSettings);driver.Url="https://my-location.org/";}}}
driver=Selenium::WebDriver.for(:remote,:url=>"<grid-url>",:capabilities=>:chrome)begin# Latitude and longitude of Tokyo, Japancoordinates={latitude:35.689487,longitude:139.691706,accuracy:100}devToolsSession=driver.devtoolsdevToolsSession.send_cmd('Emulation.setGeolocationOverride',coordinates)driver.get'https://my-location.org/'putsresensuredriver.quitend
constwebdriver=require('selenium-webdriver');constBROWSER_NAME=webdriver.Browser.CHROME;asyncfunctiongetDriver(){returnnewwebdriver.Builder().usingServer('<grid-url>').forBrowser(BROWSER_NAME).build();}asyncfunctionexecuteCDPCommands(){letdriver=awaitgetDriver();awaitdriver.get("<your site url>");constcdpConnection=awaitdriver.createCDPConnection('page');//Latitude and longitude of Tokyo, Japan
constcoordinates={latitude:35.689487,longitude:139.691706,accuracy:100,};awaitcdpConnection.execute("Emulation.setGeolocationOverride",coordinates);awaitdriver.quit();}executeCDPCommands();
importorg.openqa.selenium.WebDriverimportorg.openqa.selenium.chrome.ChromeOptionsimportorg.openqa.selenium.devtools.HasDevTools// Replace the version to match the Chrome version
importorg.openqa.selenium.devtools.v91.emulation.Emulationimportorg.openqa.selenium.remote.Augmenterimportorg.openqa.selenium.remote.RemoteWebDriverimportjava.net.URLimportjava.util.Optionalfunmain(){valchromeOptions=ChromeOptions()vardriver:WebDriver=RemoteWebDriver(URL("<grid-url>"),chromeOptions)driver=Augmenter().augment(driver)valdevTools=(driverasHasDevTools).devToolsdevTools.createSession()devTools.send(Emulation.setGeolocationOverride(Optional.of(52.5043),Optional.of(13.4501),Optional.of(1)))driver["https://my-location.org/"]driver.quit()}
Override Device Mode
Using Selenium’s integration with CDP, one can override the current device
mode and simulate a new mode. Width, height, mobile, and deviceScaleFactor
are required parameters. Optional parameters include scale, screenWidth,
screenHeight, positionX, positionY, dontSetVisible, screenOrientation, viewport, and displayFeature.
ChromeDriverdriver=newChromeDriver();DevToolsdevTools=driver.getDevTools();devTools.createSession();// iPhone 11 Pro dimensions
devTools.send(Emulation.setDeviceMetricsOverride(375,812,50,true,Optional.empty(),Optional.empty(),Optional.empty(),Optional.empty(),Optional.empty(),Optional.empty(),Optional.empty(),Optional.empty(),Optional.empty()));driver.get("https://selenium.dev/");driver.quit();
fromseleniumimportwebdriverdriver=webdriver.Chrome()//iPhone11Prodimensionsset_device_metrics_override=dict({"width":375,"height":812,"deviceScaleFactor":50,"mobile":True})driver.execute_cdp_cmd('Emulation.setDeviceMetricsOverride',set_device_metrics_override)driver.get("<your site url>")
usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;usingOpenQA.Selenium.DevTools;usingSystem.Threading.Tasks;usingOpenQA.Selenium.DevTools.V91.Emulation;usingDevToolsSessionDomains=OpenQA.Selenium.DevTools.V91.DevToolsSessionDomains;namespaceSelenium4Sample{publicclassExampleDevice{protectedIDevToolsSessionsession;protectedIWebDriverdriver;protectedDevToolsSessionDomainsdevToolsSession;publicasyncTaskDeviceModeTest(){ChromeOptionschromeOptions=newChromeOptions();//Set ChromeDriverdriver=newChromeDriver();//Get DevToolsIDevToolsdevTools=driverasIDevTools;//DevTools Sessionsession=devTools.GetDevToolsSession();vardeviceModeSetting=newSetDeviceMetricsOverrideCommandSettings();deviceModeSetting.Width=600;deviceModeSetting.Height=1000;deviceModeSetting.Mobile=true;deviceModeSetting.DeviceScaleFactor=50;awaitsession.GetVersionSpecificDomains<OpenQA.Selenium.DevTools.V91.DevToolsSessionDomains>().Emulation.SetDeviceMetricsOverride(deviceModeSetting);driver.Url="<your site url>";}}}
// File must contain the following using statementsusingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;usingOpenQA.Selenium.DevTools;// We must use a version-specific set of domainsusingOpenQA.Selenium.DevTools.V94.Performance;publicasyncTaskPerformanceMetricsExample(){IWebDriverdriver=newChromeDriver();IDevToolsdevTools=driverasIDevTools;DevToolsSessionsession=devTools.GetDevToolsSession();awaitsession.SendCommand<EnableCommandSettings>(newEnableCommandSettings());varmetricsResponse=awaitsession.SendCommand<GetMetricsCommandSettings,GetMetricsCommandResponse>(newGetMetricsCommandSettings());driver.Navigate().GoToUrl("http://www.google.com");driver.Quit();varmetrics=metricsResponse.Metrics;foreach(Metricmetricinmetrics){Console.WriteLine("{0} = {1}",metric.Name,metric.Value);}}
The following list of APIs will be growing as the WebDriver BiDirectional Protocol grows
and browser vendors implement the same.
Additionally, Selenium will try to support real-world use cases that internally use a combination of W3C BiDi protocol APIs.
If there is additional functionality you’d like to see, please raise a
feature request.
8.3.1 - Browsing Context
This section contains the APIs related to browsing context commands.
A reference browsing context is a top-level browsing context.
The API allows to pass the reference browsing context, which is used to create a new window. The implementation is operating system specific.
A reference browsing context is a top-level browsing context.
The API allows to pass the reference browsing context, which is used to create a new tab. The implementation is operating system specific.
Provides a tree of all browsing contexts descending from the parent browsing context, including the parent browsing context upto the depth value passed.
constinspector=awaitLogInspector(driver)awaitinspector.onJavascriptException(function(log){logEntry=log})awaitdriver.get('https://www.selenium.dev/selenium/web/bidi/logEntryAdded.html')awaitdriver.findElement({id:'jsException'}).click()assert.equal(logEntry.text,'Error: Not working')assert.equal(logEntry.type,'javascript')assert.equal(logEntry.level,'error')
Support classes provide optional higher level features.
The core libraries of Selenium try to be low level and non-opinionated.
The Support classes in each language provide opinionated wrappers for common interactions
that may be used to simplify some behaviors.
9.1 - Working With Colors
You will occasionally want to validate the colour of something as part of your tests;
the problem is that colour definitions on the web are not constant.
Would it not be nice if there was an easy way to compare
a HEX representation of a colour with a RGB representation of a colour,
or a RGBA representation of a colour with a HSLA representation of a colour?
Worry not. There is a solution: the Color class!
First of all, you will need to import the class:
importorg.openqa.selenium.support.Color;
fromselenium.webdriver.support.colorimportColor
// This feature is not implemented - Help us by sending a pr to implement this feature
includeSelenium::WebDriver::Support
// This feature is not implemented - Help us by sending a pr to implement this feature
importorg.openqa.selenium.support.Color
You can now start creating colour objects.
Every colour object will need to be created from a string representation of
your colour.
Supported colour representations are:
You can now safely query an element
to get its colour/background colour knowing that
any response will be correctly parsed
and converted into a valid Color object:
Select lists have special behaviors compared to other elements.
The Select object will now give you a series of commands
that allow you to interact with a <select> element.
If you are using Java or .NET make sure that you’ve properly required the support package
in your code. See the full code from GitHub in any of the examples below.
Note that this class only works for HTML elements select and option.
It is possible to design drop-downs with JavaScript overlays using div or li,
and this class will not work for those.
Types
Select methods may behave differently depending on which type of <select> element is being worked with.
Single select
This is the standard drop-down object where one and only one option may be selected.
<selectname="selectomatic"><optionselected="selected"id="non_multi_option"value="one">One</option><optionvalue="two">Two</option><optionvalue="four">Four</option><optionvalue="still learning how to count, apparently">Still learning how to count, apparently</option></select>
Multiple select
This select list allows selecting and deselecting more than one option at a time.
This only applies to <select> elements with the multiple attribute.
First locate a <select> element, then use it to initialize a Select object.
Note that as of Selenium 4.5, you can’t create a Select object if the <select> element is disabled.
Get a list of selected options in the <select> element. For a standard select list
this will only be a list with one element, for a multiple select list it can contain
zero or many elements.
The Select class provides three ways to select an option.
Note that for multiple select type Select lists, you can repeat these methods
for each element you want to select.
ThreadGuard checks that a driver is called only from the same thread that created it.
Threading issues especially when running tests in Parallel may have mysterious
and hard to diagnose errors. Using this wrapper prevents this category of errors
and will raise an exception when it happens.
The following example simulate a clash of threads:
publicclassDriverClash{//thread main (id 1) created this driver
privateWebDriverprotectedDriver=ThreadGuard.protect(newChromeDriver());static{System.setProperty("webdriver.chrome.driver","<Set path to your Chromedriver>");}//Thread-1 (id 24) is calling the same driver causing the clash to happen
Runnabler1=()->{protectedDriver.get("https://selenium.dev");};Threadthr1=newThread(r1);voidrunThreads(){thr1.start();}publicstaticvoidmain(String[]args){newDriverClash().runThreads();}}
The result shown below:
Exception in thread "Thread-1" org.openqa.selenium.WebDriverException:
Thread safety error; this instance of WebDriver was constructed
on thread main (id 1)and is being accessed by thread Thread-1 (id 24)
This is not permitted and *will* cause undefined behaviour
As seen in the example:
protectedDriver Will be created in Main thread
We use Java Runnable to spin up a new process and a new Thread to run the process
Both Thread will clash because the Main Thread does not have protectedDriver in it’s memory.
ThreadGuard.protect will throw an exception.
Note:
This does not replace the need for using ThreadLocal to manage drivers when running parallel.
10 - Troubleshooting Assistance
How to get manage WebDriver problems.
It is not always obvious the root cause of errors in Selenium.
The most common Selenium-related error is a result of poor synchronization.
Read about Waiting Strategies. If you aren’t sure if it
is a synchronization strategy you can try temporarily hard coding a large sleep
where you see the issue, and you’ll know if adding an explicit wait can help.
Note that many errors that get reported to the project are actually caused by
issues in the underlying drivers that Selenium sends the commands to. You can rule
out a driver problem by executing the command in multiple browsers.
If you have questions about how to do things, check out the Support options
for ways get assistance.
If you think you’ve found a problem with Selenium code, go ahead and file a
Bug Report
on GitHub.
10.1 - Understanding Common Errors
How to get deal with various problems in your Selenium code.
Invalid Selector Exception
CSS and XPath Selectors are sometimes difficult to get correct.
Likely Cause
The CSS or XPath selector you are trying to use has invalid characters or an invalid query.
An element goes stale when it was previously located, but can not be currently accessed.
Elements do not get relocated automatically; the driver creates a reference ID for the element and
has a particular place it expects to find it in the DOM. If it can not find the element
in the current DOM, any action using that element will result in this exception.
Common Causes
This can happen when:
You have refreshed the page, or the DOM of the page has dynamically changed.
You have navigated to a different page.
You have switched to another window or into or out of a frame or iframe.
Common Solutions
The DOM has changed
When the page is refreshed or items on the page have moved around, there is still
an element with the desired locator on the page, it is just no longer accessible
by the element object being used, and the element must be relocated before it can be used again.
This is often done in one of two ways:
Always relocate the element every time you go to use it. The likelihood of
the element going stale in the microseconds between locating and using the element
is small, though possible. The downside is that this is not the most efficient approach,
especially when running on a remote grid.
Wrap the Web Element with another object that stores the locator, and caches the
located Selenium element. When taking actions with this wrapped object, you can
attempt to use the cached object if previously located, and if it is stale, exception
can be caught, the element relocated with the stored locator, and the method re-tried.
This is more efficient, but it can cause problems if the locator you’re using
references a different element (and not the one you want) after the page has changed.
The Context has changed
Element objects are stored for a given context, so if you move to a different context —
like a different window or a different frame or iframe — the element reference will
still be valid, but will be temporarily inaccessible. In this scenario, it won’t
help to relocate the element, because it doesn’t exist in the current context.
To fix this, you need to make sure to switch back to the correct context before using the element.
The Page has changed
This scenario is when you haven’t just changed contexts, you have navigated to another page
and have destroyed the context in which the element was located.
You can’t just relocate it from the current context,
and you can’t switch back to an active context where it is valid. If this is the reason
for your error, you must both navigate back to the correct location and relocate it.
10.1.1 - Unable to Locate Driver Error
Troubleshooting missing path to driver executable.
Historically, this is the most common error beginning Selenium users get
when trying to run code for the first time:
The path to the driver executable must
be set by the webdriver.chrome.driver system property;
for more information, see https://chromedriver.chromium.org/.
The latest version can be downloaded from https://chromedriver.chromium.org/downloads
The executable chromedriver needs to be available in the path.
The file geckodriver does not exist. The driver can be downloaded at https://github.com/mozilla/geckodriver/releases"
Unable to locate the chromedriver executable;
Likely cause
Through WebDriver, Selenium supports all major browsers.
In order to drive the requested browser, Selenium needs to
send commands to it via an executable driver.
This error means the necessary driver could not be
found by any of the means Selenium attempts to use.
Possible solutions
There are several ways to ensure Selenium gets the driver it needs.
Use the latest version of Selenium
As of Selenium 4.6, Selenium downloads the correct driver for you.
You shouldn’t need to do anything. If you are using the latest version
of Selenium and you are getting an error,
please turn on logging
and file a bug report with that information.
If you want to read more information about how Selenium manages driver downloads for you,
you can read about the Selenium Manager.
This is a flexible option to change location of drivers without having to update your code,
and will work on multiple machines without requiring that each machine put the
drivers in the same place.
You can either place the drivers in a directory that is already listed in PATH,
or you can place them in a directory and add it to PATH.
To see what directories are already on PATH, open a Terminal and execute:
echo$PATH
If the location to your driver is not already in a directory listed,
you can add a new directory to PATH:
You can test if it has been added correctly by checking the version of the driver:
chromedriver --version
To see what directories are already on PATH, open a Command Prompt and execute:
echo %PATH%
If the location to your driver is not already in a directory listed,
you can add a new directory to PATH:
setx PATH "%PATH%;C:\WebDriver\bin"
You can test if it has been added correctly by checking the version of the driver:
chromedriver.exe --version
Specify the location of the driver
If you cannot upgrade to the latest version of Selenium, you
do not want Selenium to download drivers for you, and you can’t figure
out the environment variables, you can specify the location of the driver in the Service object.
Specifying the location in the code itself has the advantage of not needing
to figure out Environment Variables on your system, but has the drawback of
making the code less flexible.
Driver management libraries
Before Selenium managed drivers itself, other projects were created to
do so for you.
If you can’t use Selenium Manager because you are using
an older version of Selenium (please upgrade),
or need an advanced feature not yet implemented by Selenium Manager,
you might try one of these tools to keep your drivers automatically updated:
Note: The Opera driver no longer works with the latest functionality of Selenium and is currently officially unsupported.
10.2 - Logging Selenium commands
Getting information about Selenium execution.
Turning on logging is a valuable way to get extra information that might help you determine
why you might be having a problem.
Getting a logger
Java logs are typically created per class. You can work with the default logger to
work with all loggers. To filter out specific classes, see Filtering
Java Logging is not exactly straightforward, and if you are just looking for an easy way
to look at the important Selenium logs,
take a look at the Selenium Logger project
Python logs are typically created per module. You can match all submodules by referencing the top
level module. So to work with all loggers in selenium module, you can do this:
.NET does not currently have a Logging implementation
If you want to see as much debugging as possible in all the classes,
you can turn on debugging globally in Ruby by setting $DEBUG = true.
For more fine-tuned control, Ruby Selenium created its own Logger class to wrap the default Logger class.
This implementation provides some interesting additional features.
Obtain the logger directly from the #loggerclass method on the Selenium::WebDriver module:
Things get complicated when you use PyTest, though. By default, PyTest hides logging unless the test
fails. You need to set 3 things to get PyTest to display logs on passing tests.
To always output logs with PyTest you need to run with additional arguments.
First, -s to prevent PyTest from capturing the console.
Second, -p no:logging, which allows you to override the default PyTest logging settings so logs can
be displayed regardless of errors.
So you need to set these flags in your IDE, or run PyTest on command line like:
pytest -s -p no:logging
Finally, since you turned off logging in the arguments above, you now need to add configuration to
turn it back on:
logging.basicConfig(level=logging.WARN)
.NET does not currently have a Logging implementation
Ruby logger has 5 logger levels: :debug, :info, :warn, :error, :fatal.
The default is :info.
Things are logged as warnings if they are something the user needs to take action on. This is often used
for deprecations. For various reasons, Selenium project does not follow standard Semantic Versioning practices.
Our policy is to mark things as deprecated for 3 releases and then remove them, so deprecations
may be logged as warnings.
Java logs actionable content at logger level WARN
Example:
May 08, 2023 9:23:38 PM dev.selenium.troubleshooting.LoggingTest logging
WARNING: this is a warning
Python logs actionable content at logger level — WARNING
Details about deprecations are logged at this level.
Example:
WARNING selenium:test_logging.py:23 this is a warning
.NET does not currently have a Logging implementation
Ruby logs actionable content at logger level — :warn.
Details about deprecations are logged at this level.
For example:
2023-05-08 20:53:13 WARN Selenium [:example_id] this is a warning
Because these items can get annoying, we’ve provided an easy way to turn them off, see filtering section below.
Content Help
Note:
This section needs additional and/or updated content
This is the default level where Selenium logs things that users should be aware of but do not need to take actions on.
This might reference a new method or direct users to more information about something
Java logs useful information at logger level INFO
Example:
May 08, 2023 9:23:38 PM dev.selenium.troubleshooting.LoggingTest logging
INFO: this is useful information
Python logs useful information at logger level — INFO
Example:
INFO selenium:test_logging.py:22 this is useful information
.NET does not currently have a Logging implementation
Ruby logs useful information at logger level — :info.
Example:
2023-05-08 20:53:13 INFO Selenium [:example_id] this is useful information
Logs useful information at level: INFO
Content Help
Note:
This section needs additional and/or updated content
Java logging is managed on a per class level, so
instead of using the root logger (Logger.getLogger("")), set the level you want to use on a per-class
basis:
Assertions.assertTrue(fileContent.contains("this is a warning"));
.NET does not currently have a Logging implementation
Ruby’s logger allows you to opt in (“allow”) or opt out (“ignore”) of log messages based on their IDs.
Everything that Selenium logs includes an ID. You can also turn on or off all deprecation notices by
using :deprecations.
These methods accept one or more symbols or an array of symbols: