# DATA+ Technical Description

**1. Introduction**

Effective risk management in finance has two essential prerequisites: a reliable source of market data and a tool for comprehensive statistical analysis of the underlying random variables.

DATA+ is designed to meet these requirements. Its tasks are:

- to automatically collect observations of specified time series from financial data providers
- to store the time series in a database for further browsing and analysis
- to produce advanced estimates of the parameters for stochastic processes and yield curves
- to create an environment for statistical and econometric analysis for corporate-wide needs

In other words, the objective is to conduct an extensive analysis of the factors which influence the stochastic future values of the portfolios of a corporation or a financial institution.

Typically, DATA+ is used by risk management experts to periodically update the parameters of stochastic processes which form the mathematical foundation for any simulation-based analysis. The program establishes the link between raw historical data on risk factors and the parameters of stochastic processes used in Monte Carlo simulation analysis.

DATA+ is implemented with Internet tools and is able to receive real-time data from standard providers of financial information such as Reuters or Bloomberg.

The software can be implemented as a stand-alone application, or it can be configured to provide the essential market data input for advanced Monte Carlo simulation systems such as VAR+ and ACES+.

**2. DATA+ methodology**

DATA+ estimates the parameters of the following stochastic processes:

- diffusion (typical for commodity prices and indexes)
- mean reversion (interest rates)
- geometric Brownian motion (stock prices and exchange rates)

For mean reversion and geometric Brownian motion, level-dependent volatility is also permitted. The parameters can be estimated by ordinary least squares (OLS), conditional maximum likelihood (ML) or exponentially weighted moving average (EWMA) methods.
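As an illustration of the OLS route for a mean reversion process, the following is a minimal sketch and not the DATA+ implementation. It assumes a simple discretisation in which the change of the series is regressed on its level, with constant volatility. The function name `estimate_mean_reversion_ols` and the sample path are hypothetical.

```python
import math


def estimate_mean_reversion_ols(series, dt):
    """OLS fit of x[t+1] - x[t] = kappa * (theta - x[t]) * dt + noise.

    Returns (kappa, theta, sigma): adjustment speed, long-run equilibrium
    and volatility per square root of one time unit.
    """
    levels = series[:-1]
    changes = [b - a for a, b in zip(series, series[1:])]
    n = len(levels)
    if n < 3:
        raise ValueError("need at least four observations")

    mean_x = sum(levels) / n
    mean_dx = sum(changes) / n
    sxx = sum((x - mean_x) ** 2 for x in levels)
    sxy = sum((x - mean_x) * (dx - mean_dx) for x, dx in zip(levels, changes))

    # Regress the change on the level: dx = intercept + slope * x + residual.
    slope = sxy / sxx
    intercept = mean_dx - slope * mean_x
    residuals = [dx - intercept - slope * x for x, dx in zip(levels, changes)]
    residual_variance = sum(r * r for r in residuals) / (n - 2)

    # Map the regression coefficients back to the process parameters.
    kappa = -slope / dt
    theta = -intercept / slope
    sigma = math.sqrt(residual_variance / dt)
    return kappa, theta, sigma


# Noise-free monthly path pulled from 8% towards 4%, for illustration only.
path = [0.08]
for _ in range(24):
    path.append(path[-1] + 0.5 * (0.04 - path[-1]) * (1 / 12))

kappa, theta, sigma = estimate_mean_reversion_ols(path, dt=1 / 12)
print(kappa, theta, sigma)
```

With real data the residuals are not zero, and they are the quantities that would then be examined further for time-dependent volatility.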

Special emphasis can also be given to the phenomenon of volatility clustering observed in many financial markets. Residuals obtained during the estimation procedures can be further examined for multivariate GARCH effects to account for time-dependent volatilities and correlations.
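To make the idea of time-dependent volatility concrete, here is a minimal univariate sketch of the GARCH(1,1) variance recursion. It is an assumption-laden illustration only: the parameters are taken as given rather than estimated, the multivariate case is not covered, and the function names are hypothetical. The EWMA scheme appears as the special case with no constant term.

```python
def garch_variance_path(residuals, omega, alpha, beta, initial_variance):
    """Conditional variances h[t] = omega + alpha * e[t-1]**2 + beta * h[t-1].

    Returns one variance per residual; h[0] is the supplied starting value.
    """
    if omega < 0 or alpha < 0 or beta < 0:
        raise ValueError("omega, alpha and beta must be non-negative")
    if not residuals:
        return []
    variances = [initial_variance]
    for e in residuals[:-1]:
        variances.append(omega + alpha * e * e + beta * variances[-1])
    return variances


def ewma_variance_path(residuals, decay, initial_variance):
    """EWMA variance: GARCH(1,1) with omega = 0, alpha = 1 - decay, beta = decay."""
    return garch_variance_path(residuals, 0.0, 1.0 - decay, decay, initial_variance)


# A large residual raises the next period's variance, which then decays slowly.
shocks = [0.0, 0.02, 0.0, 0.0]
print(ewma_variance_path(shocks, decay=0.9, initial_variance=0.0001))
```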

Another major task of DATA+ is to estimate an appropriate term structure model of interest rates for each particular market. For most of the outstanding maturities there are no zero-coupon bonds and therefore no zero-coupon yields are directly available. DATA+ constructs continuous yield curves from which zero-coupon yields can be read. Doing this on a daily basis, we obtain a time series for each key interest rate that can be analysed, for example, by means of the mean reversion process mentioned above.

**3. Input information**

The input for the estimation procedure of DATA+ consists of time series on:

- exchange rates
- bond prices, zero-coupon yields, swap rates
- stock prices and indexes
- other market prices, e.g. commodity prices
- any other variables quoted on information services

Each time series can be updated as frequently as required, even on a tick basis. Different observations are collected at different times. For example, time series from Asia can be updated at a different time of day than those from Europe. Multiple data sources can be linked to DATA+. Typically, day-to-day data sources are either treasury systems or on-line data feeds such as Reuters or Bloomberg. Sometimes it can be useful to import a whole time series as a text file.

The quality of input data is important for risk management calculations. An incorrect value in a time series (a spike or an outlier) could render a set of estimates meaningless. This can happen if the input data is corrupted by network or workstation problems, or if an information provider reports false values. Sometimes the data links to information providers become outdated or invalid. DATA+ examines the input data for such cases. It can detect missing or potentially incorrect values and report them to the DATA+ administrator.
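The checks DATA+ actually applies are not specified here. As one possible approach, the sketch below flags missing values and isolated spikes, using the median absolute deviation (MAD) of the observation-to-observation changes as a robust scale. The function name `find_suspect_observations` and the threshold of 6 are assumptions made for illustration.

```python
import math
import statistics


def find_suspect_observations(values, threshold=6.0):
    """Return (missing_indices, spike_indices) for a list of observations.

    None and NaN count as missing. A spike is an observation reached by an
    unusually large change that is immediately reversed by another one.
    """
    missing = [
        i for i, v in enumerate(values)
        if v is None or (isinstance(v, float) and math.isnan(v))
    ]
    missing_set = set(missing)
    present = [(i, v) for i, v in enumerate(values) if i not in missing_set]

    # Change into each present observation from the previous present one.
    changes = [(j, b - a) for (_, a), (j, b) in zip(present, present[1:])]
    if len(changes) < 3:
        return missing, []

    deltas = [d for _, d in changes]
    median = statistics.median(deltas)
    mad = statistics.median([abs(d - median) for d in deltas])
    # If mad is zero (a flat series), any non-zero change counts as large.
    limit = threshold * mad
    large = {j for j, d in changes if abs(d - median) > limit}

    spikes = []
    for (j, d), (k, d_next) in zip(changes, changes[1:]):
        if j in large and k in large and d * d_next < 0:
            spikes.append(j)
    return missing, spikes


quotes = [100.0, 100.5, 100.2, None, 100.4, 150.0, 100.6, 100.3, 100.7, 100.5]
missing, spikes = find_suspect_observations(quotes)
print("missing:", missing, "spikes:", spikes)
```

A robust scale is used because a single spike would inflate an ordinary standard deviation and could hide itself. Flagged values are only candidates for review; a genuine market move can look the same.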

For the estimation tasks, you can define model profiles. A profile contains information about the time interval, the number of observations and the estimation method, and it specifies which observations are included in the analysis. For example, you might want to run daily estimations with one year of data using the OLS method, while also regularly running estimations with weekly data using some other estimation method.
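The profile format used by DATA+ is not described here. Purely as an illustration of the concept, a profile could be modelled as a small record plus a rule for selecting observations. Every name below (`ModelProfile`, `select_observations`, the field names) is hypothetical, and the sketch steps back in calendar days rather than business days.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ModelProfile:
    name: str
    interval_days: int   # spacing between observations: 1 = daily, 7 = weekly
    observations: int    # number of observations to use
    method: str          # "OLS", "ML" or "EWMA"

    def __post_init__(self):
        if self.method not in ("OLS", "ML", "EWMA"):
            raise ValueError(f"unknown estimation method: {self.method}")
        if self.interval_days < 1 or self.observations < 1:
            raise ValueError("interval_days and observations must be positive")


def select_observations(profile, history, as_of):
    """Pick the observations a profile asks for from a {date: value} history.

    Steps back from as_of in profile.interval_days increments and skips
    dates that have no observation.
    """
    selected = []
    day = as_of
    for _ in range(profile.observations):
        if day in history:
            selected.append((day, history[day]))
        day -= timedelta(days=profile.interval_days)
    selected.reverse()
    return selected


start = date(2025, 1, 1)
history = {start + timedelta(days=i): 100.0 + i for i in range(30)}
weekly = ModelProfile("weekly-ewma", interval_days=7, observations=3, method="EWMA")
print(select_observations(weekly, history, as_of=date(2025, 1, 30)))
```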

DATA+ understands a calculation language that allows you to manipulate observations of time series or create new (random) time series. For example, you can define a basket of specific stocks, or a random variable describing the behaviour of the real estate market which depends on interest rates and the inflation rate. You could also convert the price of a security into another currency. The calculation language consists of basic arithmetic operations and a large set of mathematical functions, both for numerical operations and for generating random numbers.
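The syntax of the DATA+ calculation language is not shown in this description, so the following sketch uses plain Python to illustrate the kinds of operations mentioned above: a weighted basket, a currency conversion and a derived random series. The function names and sample figures are hypothetical.

```python
import random


def basket(series_by_name, weights):
    """Weighted sum of equally long series, observation by observation."""
    names = list(weights)
    length = len(series_by_name[names[0]])
    if any(len(series_by_name[n]) != length for n in names):
        raise ValueError("all series in a basket must have the same length")
    return [
        sum(weights[n] * series_by_name[n][t] for n in names)
        for t in range(length)
    ]


def convert_currency(prices, fx_rates):
    """Convert prices into another currency using matching exchange rates."""
    if len(prices) != len(fx_rates):
        raise ValueError("prices and fx_rates must have the same length")
    return [p * fx for p, fx in zip(prices, fx_rates)]


def with_random_noise(series, volatility, seed):
    """New random series: each value scaled by (1 + normal noise)."""
    rng = random.Random(seed)  # seeded so the series can be reproduced
    return [v * (1.0 + rng.gauss(0.0, volatility)) for v in series]


stocks = {"A": [10.0, 20.0], "B": [30.0, 40.0]}
index = basket(stocks, {"A": 0.5, "B": 0.5})
index_in_other_currency = convert_currency(index, [1.10, 1.20])
print(index, index_in_other_currency)
```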

**4. Output information**

Estimation in DATA+ gives the following results:

- spot and forward rates
- volatility and correlation estimates
- estimated term structure of interest rates
- parameters of different stochastic processes, such as
  - long-run equilibrium and adjustment speed for mean reversion processes
  - expected growth rate for geometric Brownian motion processes
- the analysis of GARCH effects for volatilities and correlations, if required
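For geometric Brownian motion, the expected growth rate and volatility can be recovered from log returns. The sketch below uses the standard moment estimates with constant volatility; it is an illustration, not necessarily the estimator DATA+ uses, and the name `estimate_gbm` is hypothetical.

```python
import math


def estimate_gbm(prices, dt):
    """Estimate (mu, sigma) of dS = mu * S * dt + sigma * S * dW.

    Log returns over a step dt have mean (mu - sigma**2 / 2) * dt and
    variance sigma**2 * dt, so both parameters follow from sample moments.
    """
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    n = len(log_returns)
    if n < 2:
        raise ValueError("need at least three prices")
    mean_r = sum(log_returns) / n
    var_r = sum((r - mean_r) ** 2 for r in log_returns) / (n - 1)
    sigma = math.sqrt(var_r / dt)
    mu = mean_r / dt + 0.5 * sigma ** 2
    return mu, sigma


# Daily prices growing at a constant rate, with 250 trading days per year.
prices = [100.0 * math.exp(0.001 * i) for i in range(60)]
mu, sigma = estimate_gbm(prices, dt=1 / 250)
print(mu, sigma)
```

The drift estimate depends only on the first and last price and is therefore much noisier than the volatility estimate, which is one reason long histories matter for growth rates.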

DATA+ also finds historical estimates of process parameters, volatilities and correlations. It can map an entire chart of historical estimates for any parameter. For example, viewing recent changes in volatility estimates can help understand recent changes in VaR statistics.

There are three alternative methods available for the estimation of the term structure of interest rates. The first method, smoothing splines, combines two approximation objectives: to fit a smooth curve to the given data and, at the same time, to keep the pricing errors as small as possible. The method employed in DATA+ is based on [Käppi 1997] and uses a new smoothing norm, namely the square of the discontinuity in the third derivatives at the interior knot points, which are located according to the size of the fitting errors. The second estimation method uses a regression cubic spline to approximate the discount function as described in [McCulloch 1975]. The third estimation method employs the extended Nelson-Siegel function as specified in [Nelson and Siegel 1987].
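As a point of reference for the third method, the sketch below evaluates the basic four-parameter Nelson-Siegel zero-coupon yield curve. The extended function used in DATA+ may include further terms that are not shown, and the parameter values are invented for illustration; fitting the parameters to market quotes is a separate optimisation step.

```python
import math


def nelson_siegel_yield(maturity, beta0, beta1, beta2, tau):
    """Continuously compounded zero-coupon yield for a maturity in years.

    beta0 is the long-term level, beta0 + beta1 the instantaneous short
    rate, beta2 controls the hump and tau (> 0) sets where it occurs.
    """
    if maturity < 0 or tau <= 0:
        raise ValueError("maturity must be non-negative and tau positive")
    if maturity == 0:
        return beta0 + beta1  # limit of the formula as maturity tends to 0
    x = maturity / tau
    loading = (1.0 - math.exp(-x)) / x
    return beta0 + beta1 * loading + beta2 * (loading - math.exp(-x))


def discount_factor(maturity, beta0, beta1, beta2, tau):
    """Price today of one unit paid at the given maturity."""
    y = nelson_siegel_yield(maturity, beta0, beta1, beta2, tau)
    return math.exp(-y * maturity)


# Hypothetical parameters: 5% long rate, 3% short rate, mild hump.
params = dict(beta0=0.05, beta1=-0.02, beta2=0.01, tau=2.0)
for years in (0.25, 1, 2, 5, 10, 30):
    print(years, round(nelson_siegel_yield(years, **params), 5))
```

Because the curve is a closed-form function of maturity, a zero-coupon yield can be read off for any maturity, including those for which no instrument is quoted.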

The input information for yield curve estimation consists of quotations for bonds and other fixed-income instruments, which can be either entered manually or fetched from real-time market data sources such as Reuters or Bloomberg.

In general, the following information is required for each instrument included in the yield curve estimation procedure:

- settlement date
- maturity date
- annualised coupon rate
- yield- or price-based quotation
- frequency of coupon payments
- daycount basis
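The fields listed above are enough to generate an instrument's cash flows and to price it from a candidate yield curve, which is how a pricing error can be measured. The sketch below is a simplified illustration: the class and function names are hypothetical, only an ACT/365 day count is handled, coupon dates are approximated by stepping back from maturity in equal year fractions, and accrued interest is ignored.

```python
import math
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BondQuote:
    settlement: date
    maturity: date
    coupon_rate: float   # annualised, e.g. 0.05 for 5%
    frequency: int       # coupon payments per year
    quote: float         # quoted price per 100 face value
    daycount: str = "ACT/365"


def cash_flows(bond, face=100.0):
    """List of (time in years, amount), earliest payment first."""
    if bond.daycount != "ACT/365" or bond.frequency < 1:
        raise ValueError("sketch supports ACT/365 and frequency >= 1 only")
    years_to_maturity = (bond.maturity - bond.settlement).days / 365.0
    coupon = face * bond.coupon_rate / bond.frequency
    flows = []
    k = 0
    while True:
        t = years_to_maturity - k / bond.frequency
        if t <= 0:
            break
        # The final payment (k == 0) also repays the face value.
        flows.append((t, coupon + (face if k == 0 else 0.0)))
        k += 1
    flows.reverse()
    return flows


def model_price(bond, zero_yield, face=100.0):
    """Discount each cash flow with a continuously compounded zero yield."""
    return sum(a * math.exp(-zero_yield(t) * t) for t, a in cash_flows(bond, face))


bond = BondQuote(date(2025, 1, 1), date(2027, 1, 1), 0.05, 1, quote=99.0)
flat_curve = lambda t: 0.05  # hypothetical flat 5% zero curve
print(model_price(bond, flat_curve) - bond.quote)  # pricing error
```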

Most of this input data can be directly extracted from links to Reuters or Bloomberg.

The whole output of DATA+ is directly available to VAR+ and ACES+ for Monte Carlo simulation analysis, but it can also be interfaced with other systems and used for other purposes.

**5. Technical specifications**

DATA+ can be used as a stand-alone product, but it can also be integrated into a common platform together with VAR+ and ACES+. The primary external interfaces are the file import and export capabilities, as well as direct database links.

DATA+ requires a dedicated Linux server (RedHat 6.2 or later) with 256 MB RAM. For on-line data, Reuters, Bloomberg or another information service, such as an in-house treasury system, must be available on a workstation on the network. DATA+ is directly available to all users of the corporate intranet via a standard web browser such as Microsoft Internet Explorer or Netscape Navigator.

**6. User benefits**

DATA+ gives you:

- fast access to risk management data
- fast access to statistical estimation tools
- simple and easy web site interface
- rigorous checks to ensure data integrity
- power to operate and modify time series
- consistent day-to-day estimation with user profiles you can modify
- state-of-the-art yield curve estimation methodologies
- reports in your web browser, in Excel sheets and by e-mail
- automatic or interactive operation
- input for CD Financial Technology risk engines VAR+ and ACES+

**7. End note**

More specifications are provided in the user guide material.

CDFT is pleased to assist you with further questions, inquiries and written material.