Channel: SCN : Blog List - SAP SQL Anywhere

From the Archives: Using RowGenerator


In this post, originally written by Glenn Paulley and posted to sybase.com in September of 2009, Glenn talks about the use of the built-in RowGenerator table.



Join conditions that involve only inequality conditions are rare, primarily because most joins are between tables related through referential integrity constraints. In doing some analysis this week, however, I came up with an example that illustrates a case where joins over inequalities are useful.

 

My example involved doing some analysis over project tasks that had "creation" and "completion" timestamps, akin to

 

 

CREATE TABLE projects (
    project_id     INTEGER NOT NULL PRIMARY KEY,
    short_desc     VARCHAR(255),
    long_desc      LONG VARCHAR,
    project_status VARCHAR(20),
    creation_ts    TIMESTAMP NOT NULL,
    completion_ts  TIMESTAMP
)

 

The actual schema I was querying is much more complex than this, but this simple example serves to illustrate the basic idea. What I wanted was to create a result set that, for every week, contained a count of the number of projects that were in progress and the number of projects that were completed in that week. Once the data is factored out week by week, I could then perform historical analysis on that intermediate result using some of the built-in OLAP functionality in SQL Anywhere.

 

The function DATEDIFF( WEEK, completion_ts, creation_ts ) gives the difference in weeks between the two timestamps, so that part is straightforward, except for those projects that span a calendar year. Notwithstanding that complication, the more significant problem is that I wanted to generate a row for every week the project was unfinished. I needed to join the projects table with something to generate the additional rows, but what?

 

SQL Anywhere databases contain a table named RowGenerator precisely for this purpose; it's a single-column table (row_num) that contains 255 rows with values starting from one. To generate the result set I needed, here's the query:

 

 

SELECT p.project_id, p.short_desc, p.creation_ts, p.completion_ts,
       (IF p.completion_ts IS NULL THEN
            ABS(DATEDIFF( WEEK, NOW(), p.creation_ts ))
        ELSE
            ABS(DATEDIFF( WEEK, p.completion_ts, p.creation_ts ))
        ENDIF) AS weeks_outstanding,
       (IF p.project_status != 'Complete' OR weeks_outstanding = 0 OR weeks_outstanding > week_number
        THEN 1 ELSE 0 ENDIF) AS incomplete_projects,
       (IF p.completion_ts IS NOT NULL AND (weeks_outstanding = 0 OR weeks_outstanding = week_number)
        THEN 1 ELSE 0 ENDIF) AS completed_projects,
       (IF weeks_outstanding = 0 THEN
            DATEPART( YEAR, p.creation_ts )
        ELSE
            DATEPART( YEAR, DATEADD( WEEK, RG.week_number, p.creation_ts ) )
        ENDIF) AS calendar_year,
       (IF weeks_outstanding = 0 THEN
            DATEPART( WEEK, p.creation_ts )
        ELSE
            DATEPART( WEEK, DATEADD( WEEK, RG.week_number, p.creation_ts ) )
        ENDIF) AS calendar_week
  FROM ( SELECT (row_num - 1) AS week_number FROM RowGenerator ) AS RG,
       projects p
 WHERE weeks_outstanding >= RG.week_number

 

The query joins the built-in RowGenerator table to the projects table based on the weeks_outstanding value. Hence, for each week a project is incomplete, a row will be generated in the output, including for those projects that are created and completed in the same week (where weeks_outstanding would be zero). Using the DATEPART function with WEEK means that up to 54 weeks in a year are possible, because DATEPART defines a week to begin on a Sunday.

 

Once I have this result set, I can then embed it in a derived table and, for example, sum the number of open and completed projects by calendar week in a straightforward way.
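
For instance, a minimal sketch of that aggregation, assuming the query above has been wrapped in a hypothetical view named project_weeks:

-- project_weeks is a hypothetical view containing the query shown above
SELECT calendar_year, calendar_week,
       SUM( incomplete_projects ) AS open_projects,
       SUM( completed_projects ) AS finished_projects
  FROM project_weeks
 GROUP BY calendar_year, calendar_week
 ORDER BY calendar_year, calendar_week;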

 

The correctness of the solution depends on one factor: that no project takes more than 255 weeks to complete; otherwise there are insufficient rows in the RowGenerator table to generate the required number of rows. Should that be a problem, SQL Anywhere provides another row generator mechanism: the sa_rowgenerator() system procedure. The sa_rowgenerator() procedure takes three parameters: the starting value, the end value, and the step increment (the default is 1). Joining sa_rowgenerator() to the projects table is identical to using the RowGenerator base table, since SQL Anywhere supports table functions (procedures in the FROM clause).
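
For example, a minimal sketch that generates a longer sequence with the system procedure:

-- start, end, and step arguments; generates the integers 1 through 1000
SELECT row_num FROM sa_rowgenerator( 1, 1000, 1 );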


From the Archives: Differences between jConnect and the iAnywhere JDBC driver - part un


In this post, originally written by Glenn Paulley and posted to sybase.com in March of 2009, Glenn talks about the differences between jConnect and the SQL Anywhere JDBC driver. While I have left in the references to specific supported versions of jConnect, JDBC, and SQL Anywhere for historical reference, I do want to note that versions 12 and 16 of the SQL Anywhere JDBC driver now support JDBC 4.0. The latest version of jConnect is 7.07, which also supports JDBC 4.0.



Lately I've been getting some questions about the differences between Sybase jConnect and the iAnywhere JDBC driver, so I thought I'd put some points down here. Eventually I'll turn this content into a Sybase technical document, but here it is for now.

 

In a nutshell

A brief summary of the differences between the two JDBC implementations is as follows:

  • jConnect is a Type 4 JDBC driver, while the iAnywhere JDBC driver is Type 1, and relies on the existence of a properly-installed ODBC driver.
  • jConnect uses the Sybase Tabular Data Stream (TDS) wire protocol, while the iAnywhere JDBC driver uses the native and proprietary SQL Anywhere application-level protocol called CMDSEQ.
  • SQL Anywhere supports TDS only over the TCP/IP network protocol. In contrast, the SQL Anywhere-specific CMDSEQ protocol supports both TCP/IP as well as an efficient shared-memory protocol designed for same-computer communication.
  • It is assumed that applications connecting to SQL Anywhere via jConnect, and hence using TDS, expect Sybase ASE behaviour; hence the SQL Anywhere implementation of jConnect support sets a number of ASE compatibility options immediately after the application connects to the database (see below).
  • A variety of jConnect behaviours stem from its native support for Sybase ASE. This is particularly true with jConnect's support for cursors and specific data types (more on this in a subsequent article).

 

JDBC driver types

Sybase jConnect is a Type 4 JDBC driver which is entirely Java-based. In contrast, the iAnywhere JDBC driver is a Type 1 driver, as it relies on its underlying (non-Java) ODBC driver to actually communicate with the SQL Anywhere server. Both the iAnywhere JDBC driver and jConnect 6.0.5 are JDBC 3.0 compliant.

 

SQL Anywhere Version 11 supports only JDBC 3.0 (JDK 1.4 and up). SQL Anywhere Version 10 supported both JDBC 2.0 and 3.0; the jodbc.jar that shipped with Version 10 contained support for both, but which version was used depended on the driver name:

  • iAnywhere.ml.jdbcodbc.idriver provides JDBC 2.0 support; while
  • iAnywhere.ml.jdbcodbc.jdbc3.idriver provides JDBC 3.0 support.

 

If you plan to use a Java application on a Windows CE platform, the older jConnect 5.5 may be your best option. On Windows CE, the best Java VM to install is IBM's J9 VM with the "Mobile Database Option", which provides JDBC 2.0 support (from JDK Version 1.3). However, as only jConnect 5.5 is JDBC 2.0-compliant - jConnect 6.0.5 and SQL Anywhere 11 require JDBC 3.0 - you should install jConnect 5.5 (jconn2.jar) on the device.

 

Using jConnect with SQL Anywhere on any platform requires installation of jConnect's JDBC metadata schema in the database. By default, SQL Anywhere databases are created with jConnect metadata support; you can explicitly add it using dbupgrad -j. jConnect is downloadable from sybase.com; make sure you specify "all months" for the display of possible versions/EBFs to download.

 

Connection options

When connecting via jConnect, the SQL Anywhere server automatically resets the values of several option settings to permit the server to emulate Sybase ASE behaviour; this occurs in the sp_tsql_environment system procedure, which executes the following SET OPTION statements for the connection:

 

SET TEMPORARY OPTION allow_nulls_by_default='Off';
SET TEMPORARY OPTION ansi_blanks='On';
SET TEMPORARY OPTION ansinull='Off';
SET TEMPORARY OPTION chained='Off';
SET TEMPORARY OPTION close_on_endtrans='Off';
SET TEMPORARY OPTION date_format='YYYY-MM-DD';
SET TEMPORARY OPTION date_order='MDY';
SET TEMPORARY OPTION escape_character='Off';
SET TEMPORARY OPTION isolation_level='1';
SET TEMPORARY OPTION on_tsql_error='Continue';
SET TEMPORARY OPTION quoted_identifier='Off';
SET TEMPORARY OPTION time_format='HH:NN:SS.SSS';
SET TEMPORARY OPTION timestamp_format='YYYY-MM-DD HH:NN:SS.SSS';
SET TEMPORARY OPTION tsql_variables='On';

 

Note that the original or default values for the connection are not retained. Also note that the default isolation level for TDS connections is "1" (READ COMMITTED). Older versions of both DBISQL and Sybase Central undo these temporary option settings after they have connected to the database server to retain SQL Anywhere semantics as much as possible, so one may not notice any difference. Newer versions of the DBISQL and Sybase Central admin tools (SQL Anywhere Version 10 and up) no longer support connecting via jConnect.

 

Next: Semantic and performance differences between jConnect and the iAnywhere JDBC driver.

From the Archives: Differences between jConnect and the iAnywhere JDBC driver - part deux


In this post, originally written by Glenn Paulley and posted to sybase.com in October of 2009, Glenn talks more about the differences between the jConnect and SQL Anywhere JDBC drivers.

 

In a previous post I briefly described some of the differences between the jConnect JDBC driver and the iAnywhere JDBC driver when used with SQL Anywhere. A whitepaper on sybase.com summarizes the architectural differences between the two drivers.

 

Both the jConnect and iAnywhere drivers support JDBC 3.0. jConnect is a "pure Java" solution (termed a Type 4 JDBC driver), while the iAnywhere driver is a Type 1 driver because of its reliance on the SQL Anywhere ODBC driver which must be properly installed.  It is sometimes argued that a "pure Java" solution is better/faster/more robust; hence, on paper, jConnect should be "better" than the iAnywhere Type 1 driver. However, if one looks more closely, the significant differences between the two solutions are (1) memory management, (2) the use of the TDS wire protocol, and (3) differences in semantics. We look at each of these in turn.

 

Memory management

With a pure Java solution:

  • All objects are managed by the Java virtual machine.
  • Java garbage collection cleans things up automatically. The application programmer does not have to worry about objects sticking around indefinitely, memory leaks or objects disappearing while still in use.

Unfortunately, the weakness of pure Java solutions is the same: memory management. The application programmer has little or no real control over the lifespan of an object. Moreover, the programmer has no effective control over garbage collection; garbage collection can kick in at critical times, resulting in random and unreproducible performance problems.

 

With a hybrid solution such as the iAnywhere JDBC driver, the most important advantage is memory management:

  • The programmer retains full control over non-Java objects.
  • Garbage collection can be prevented or postponed by non-Java references to Java objects.

However, as with pure Java, the greatest disadvantage of a hybrid solution is - you guessed it - also memory management. In the hybrid case, non-Java objects need to be managed explicitly; program errors lead to memory leaks at best, and memory corruption or GPFs at worst. Moreover, if Java object references are held too long, Java garbage collection won't kick in.

 

CMDSEQ versus TDS

jConnect uses Sybase ASE's native wire protocol, the Tabular Data Stream (TDS) protocol, whereas the iAnywhere JDBC driver uses SQL Anywhere's native wire protocol which is called Command Sequence (CMDSEQ). There are both semantic and performance differences between the use of the two protocols; each has advantages and disadvantages.

 

An advantage of TDS is that it supports "fire hose" cursors. That is, with a single TDS language command token one can instruct the server to execute a set of statements, describe all the result sets, and return all the results in one go to the client. In situations where the application wants all of the rows of a result set, a fire hose cursor does offer a performance advantage by reducing the amount of round-trip traffic over the wire. However, this comes at a cost: it is the client that is responsible for caching the result set, and the client that must implement cursor scrolling. The TDS client supports a "window" of rows in the result set that the Java application can scroll through - both forwards and backwards. However, should scrolling occur to a range of rows outside this window, the entire request is re-issued to the server - necessary since prior rows outside the "window" have been lost. Hence, in this model with scrollable cursors, cursor sensitivity semantics are impossible to guarantee. Moreover, with very large result sets the communication stream can become blocked if the client cannot process the returned rows quickly enough, which can, in turn, block the server.

 

While fire-hose cursors give an advantage to jConnect connections under the right circumstances, recently-added support for adaptive prefetching in CMDSEQ (see below) mitigates this advantage. Moreover, there are several additional features supported by the iAnywhere JDBC driver that provide advantages over jConnect. These include:

  • TDS is limited to TCP/IP, even for local connections, while the iAnywhere JDBC driver can use either TCP/IP or shared memory. This means that jConnect applications cannot automatically start and stop local database servers, since that capability is supported only with shared-memory connections.
  • Strong encryption support, including RSA, RSA-FIPS, and ECC encryption technologies.
  • Complete server-side cursor support. jConnect does not support server-side cursors; it implements a cursor on the client side by retrieving the entire result set across the network, even if the client will only use a small number of rows from that result set. When using jConnect, application programmers must be careful to write their SQL queries to return the smallest result set necessary, rather than rely on FETCHing only the first few rows, since the entire result set is sent to the client with each SQL request (see the sketch following this list).
  • Complete AppInfo support. jConnect truncates AppInfo details.
  • Integrated logins on Windows platforms.
  • Richer batch SQL statement support - for example, wide (batch) inserts and wide fetches. With SQL Anywhere, jConnect only fully supports wide fetches. jConnect does support wide inserts from the application, which reduces the amount of network traffic required, but on the server TDS wide inserts are simulated, with each row initiating a separate INSERT statement. In contrast, the iAnywhere JDBC driver efficiently supports both wide inserts and wide fetches.
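
To illustrate the point about server-side result set size (the orders table and its columns here are hypothetical), limit the rows in the query itself rather than relying on fetching only the first few rows of a large client-side result set:

-- Returns at most 100 rows from the server, instead of shipping the whole table
SELECT TOP 100 *
  FROM orders
 ORDER BY order_date DESC;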

 

Adaptive prefetching with CMDSEQ

SQL Anywhere version 11 introduced adaptive prefetch as a variant of prefetch behaviour with CMDSEQ connections. Prefetch is designed to reduce communication in a client-server environment by transferring sets of rows to the client in advance of a FETCH request, and is enabled by default. Prefetching can be disabled outright by specifying the DisableMultiRowFetch connection parameter, or by setting the Prefetch connection option to OFF. Prefetch is turned off for cursors declared with sensitive value semantics.
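
As a hedged example (option and parameter names as documented for recent releases; verify them for your version), prefetch can be disabled for a single connection either by adding DisableMultiRowFetch=YES to the connection string or by issuing:

-- Turn prefetch off for the current connection only
SET TEMPORARY OPTION prefetch = 'Off';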

 

With adaptive prefetching, a SQL Anywhere CMDSEQ client will automatically adjust the number of rows that are prefetched - increasing or decreasing - depending on application behaviour. The maximum number of rows that will be prefetched is capped at 1000. Adaptive prefetching is also governed by the number of rows the application can FETCH in one elapsed second. Adaptive prefetching is enabled for cursors for which all of the following are true:

  • ODBC and OLE DB: FORWARD ONLY, READ ONLY (default) cursor types; ESQL: DYNAMIC SCROLL (default), NO SCROLL and INSENSITIVE cursor types; all ADO.Net cursors
  • only FETCH NEXT operations are done (no absolute, relative or backwards fetching)
  • the application does not change the host variable type between fetches and does not use GET DATA to get column data in chunks (but using one GET DATA call to retrieve the value is fine).

 

jConnect semantics

In addition to the automatic setting of connection options to ASE-equivalent settings upon connecting with jConnect - described in my previous post - there are other semantic differences with jConnect. They include:

  • The TDS protocol does not support dates or timestamps prior to January 1, 1753.
  • Fixed-length CHAR and BINARY values are automatically padded upon retrieval from blank-padded databases.
  • With older versions of jConnect, empty string values - strings of length zero - are returned to the application as a string with a single blank in it. This is because earlier versions of TDS did not distinguish between an empty string and the NULL value.

 

If a JDBC application wants to use jConnect but does not want Sybase ASE-like behaviour, the application has to:

  • Revert the connection option settings issued by the sp_tsql_environment() system procedure by resetting those options immediately after connecting (a sketch follows this list).
  • Set the connection option RETURN_DATE_TIME_AS_STRING to ON in order to get SQL Anywhere to always return DATE/TIME/TIMESTAMP values as strings. This is to overcome the inability of TDS to handle dates prior to January 1, 1753.
  • Set the jConnect option "dynamic prepare" to TRUE to make sure prepared statements are not re-PREPAREd every time they are used.
  • Set a cursor name for each statement in order to force jConnect to use TDS cursors instead of fire-hose cursors. Note that with SQL Anywhere, jConnect will still cache result sets on the client regardless of which cursor type is used.
  • Set the fetch size explicitly on every statement in order to get jConnect to mimic CMDSEQ prefetch behaviour.
  • For older versions of jConnect:
    • Handle 'single-blank strings' as empty strings.
    • Refrain from using unsigned data types, since unsigned values are not supported with older jConnect releases.
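
As a rough sketch of the first item in this list (only a few of the options set by sp_tsql_environment are shown, and the values assume SQL Anywhere's usual defaults - verify them for your database):

SET TEMPORARY OPTION ansinull = 'On';
SET TEMPORARY OPTION quoted_identifier = 'On';
SET TEMPORARY OPTION chained = 'On';
SET TEMPORARY OPTION isolation_level = '0';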

 

In a subsequent post I'll outline performance differences between the jConnect and iAnywhere drivers. In our experience with customer applications, most applications benefit from a significant performance boost by switching to the iAnywhere JDBC driver, occasionally up to a factor of two, depending on the nature of the application and the precise sequence of JDBC API calls issued by the application.

 

My thanks to colleague Karim Khamis for providing me with the background for this article.

SQL Anywhere QA testing


(Image: a photo from the SQL Anywhere QA lab)

As seen last week in the SQL Anywhere QA lab.   I wonder what it could mean?

Using Amazon's Web Services SDK Inside SQL Anywhere


One of my colleagues, Joyce Liu, recently posted a document that explains how to create a stored procedure that automatically copies a database backup to Amazon S3.  If you haven't seen this document, here's its URL: http://scn.sap.com/docs/DOC-55978.

 

Cloud computing is a hot topic these days, and that was part of the motivation for writing the document.  You could always write a batch file that uploads any file to Amazon S3, but our approach allowed us to write that logic inside the database, as well as take advantage of Maintenance Plans to simplify the entire backup process.

 

The other reason behind writing that document is to highlight SQL Anywhere's external runtime environment.  It allows developers to execute code outside the database server, provided that code is written in a supported external environment programming language (.NET, Java, PHP, Perl or C/C++).  That code is invoked from a SQL stored procedure or function, providing a lot of versatility in the types of applications you can implement.  In Joyce's backup document, she was able to re-use an existing Amazon Web Services SDK assembly to create a small C# library that SQL Anywhere can call.
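
As a rough sketch of the idea - the assembly, class, and method names here are hypothetical, and the exact EXTERNAL NAME syntax depends on the external environment and release - a C# method that uploads a backup file to S3 might be surfaced to SQL roughly like this:

-- Hypothetical CLR wrapper around a C# method that calls the AWS SDK
CREATE PROCEDURE upload_backup_to_s3( IN file_name LONG VARCHAR )
EXTERNAL NAME 'S3BackupLib.dll::S3BackupLib.Uploader.UploadFile( string )'
LANGUAGE CLR;

-- Invoked from SQL, for example at the end of a backup maintenance task
CALL upload_backup_to_s3( 'c:\\backups\\demo.db' );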

 

Take a look at that document and see if you can come up with other examples that take advantage of this feature.  Feel free to post your ideas in the comments section below!

From the Archives: Using RowGenerator - part deux


In this post, originally written by Glenn Paulley and posted to sybase.com in October of 2009, Glenn talks about how one can simulate the behaviour of the RowGenerator table and the sa_rowgenerator(...) function using standard SQL.



After posting my recent article on the use of the RowGenerator system table, I received a welcome email from Jan-Eike Michels of IBM who, like me, sits on the DM32.2 committee for INCITS as the IBM representative for the SQL Standard:

Hi Glenn,

 

Just stumbled across your blog about the RowGenerator (http://scn.sap.com/community/sql-anywhere/blog/2014/06/25/using-rowgenerator). I don't know whether iAnywhere supports the WITH clause but (since the standard does) you could use that one as well (similar to your sa_rowgenerator procedure):

 

with dummy (counter) as
   ( select counter from table(values (1)) as x(counter)
     union all
     select counter + 1 from dummy where counter < 1000 )
select counter from dummy



would return 1000 rows.

 

I welcomed Jan-Eike's contribution because, as he quite rightly points out, it is straightforward to generate a set of identifiers recursively using the SQL standard's common table expression syntax, in this case using the recursive UNION construction.

 

 

One can use Jan-Eike's example almost verbatim in SQL Anywhere. The issues with Jan-Eike's SQL query are:

  • In SQL Anywhere, one must include the RECURSIVE keyword when specifying a recursive query;
  • SQL Anywhere servers do not recognize the TABLE keyword; and
  • SQL Anywhere already contains a (real) table, DUMMY, that generates a single-row, single-column result set.

 

So here is a version of Jan-Eike's example that generates the values between 1 and 10 in SQL Anywhere:

 

 

WITH RECURSIVE foo(counter) AS
  ( SELECT 1 FROM DUMMY
    UNION ALL
    SELECT counter + 1 FROM foo WHERE counter < 10 )
SELECT * FROM foo

 

This query defines the common table expression "foo" (instead of "dummy") and generates the specified values. The graphical plan for this query is as follows:

(Image: graphical plan for the recursive row generator query)

 

Some points to mention:

  • Specifying a larger number of values - and hence a deeper level of recursion - may require setting the MAX_RECURSIVE_ITERATIONS connection option to a higher value.
  • Jan-Eike's example generates a sequential set of values, equivalent to what the RowGenerator system table or the sa_rowgenerator() system procedure generates. However, one could modify this query to generate a non-contiguous sequence of any values desired, simply by rewriting the SELECT list expressions in the common table expression (see the sketch after this list).
  • Finally, while this recursive version does have utility, the RowGenerator system table may still be a better approach. The advantage of RowGenerator is that it is a (static) base table; hence the query optimizer is much better able to estimate the cardinality of intermediate results when RowGenerator is used within a complex query than when a common table expression is used.
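
For instance, a minimal sketch that produces a non-sequential set of values (powers of two up to 512) simply by changing the SELECT list expression in the recursive branch:

WITH RECURSIVE powers( n ) AS
  ( SELECT 1 FROM DUMMY
    UNION ALL
    SELECT n * 2 FROM powers WHERE n < 512 )
SELECT * FROM powers;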

From the Archives: Disk failures in the real world


In this post, originally written by Glenn Paulley and posted to sybase.com in October of 2009, Glenn talks about traditional hard drive reliability. An ars technica article from earlier this year indicates that not much has changed in terms of failure rates since this post was written.  Of course, taking regular backups and testing recovery scenarios to guard against disk failure is critical if you care about your data.

 

One thing that the sheer scale of the computing landscape has contributed to the field of Computer Science is the opportunity to study these systems statistically - and in particular to prove or disprove various aspects of hardware and software reliability.

 

With respect to disk drives, several large studies of disk drive reliability [2,3,4,7] have been published in the last few years. In particular, the study done at Google [4] showed a steep increase in failure rates - to between 6 and 10 percent - once a drive passed three years of usage, an interesting point since many disk drive manufacturers offer three-year warranties. Their study also showed a lower correlation between heat and drive failure in later-model drives, something that James Hamilton has written about recently in the push towards using less air conditioning within data centers. Recently at FAST 2009, Alyssa Henry of Amazon [6] said in her conference keynote that, at Amazon, the Amazon Simple Storage Service (S3) sees a hard disk failure rate of 3-5 percent per year across the board, though I am sure, given Google's survey results, that Amazon's failure experience is not uniformly distributed across all disk drive manufacturers. Iliadis and Hu [3] believe that the trend towards lower-cost magnetic media results in higher failure rates, a conclusion also reached [7] by Remzi Arpaci-Dusseau and his team at the University of Wisconsin in Madison. To some extent, at least, you do get what you pay for.

 

The actual failure rates reported in these studies are vastly different from the reliability metrics offered by disk drive manufacturers. Moreover, disk hardware failure is only part of the story. Previous work by Remzi Arpaci-Dusseau and his research team at Wisconsin found that transient errors with magnetic disk media were commonplace. Here is a quote from the summary of the Linux Storage & Filesystem Workshop, held in San Jose in February 2008:

Ric Wheeler (aside: now with RedHat) introduced the perennial error-handling topic with the comment that bad sector handling had markedly improved over the "total disaster" it was in 2007. He moved on to silent data corruption and noted that the situation here was improving with data checksumming now being built into filesystems (most notably BTRFS and XFS) and emerging support for T10 DIF. The "forced unmount" topic provoked a lengthy discussion, with James Bottomley claiming that, at least from a block point of view, everything should just work (surprise ejection of USB storage was cited as the example). Ric countered that NFS still doesn't work and others pointed out that even if block I/O works, the filesystem might still not release the inodes. Ted Ts'o closed the debate by drawing attention to a yet to be presented paper at FAST '08 showing over 1,300 cases where errors were dropped or lost in the block and filesystem layers. (emphasis added)

Reference [5] below studies the lack of reporting or mis-reporting of both transient and "hard" filesystem errors across several filesystems. Here is the first paragraph of the paper's abstract:

The reliability of file systems depends in part on how well they propagate errors. We develop a static analysis technique, EDP, that analyzes how file systems and storage device drivers propagate error codes. Running our EDP analysis on all file systems and 3 major storage device drivers in Linux 2.6, we find that errors are often incorrectly propagated; 1153 calls (13%) drop an error code without handling it.

Write caching or out-of-order writes can cause additional problems. The use of EXT3 on Linux systems, in particular, can result in a corrupt filesystem upon a catastrophic hardware failure due to EXT3's lack of support for checksumming when writing to the journal - which is supported in EXT4. Arpaci-Dusseau and his research team at Wisconsin have just recently taken this error analysis to the next level [1]. They purposefully and systematically introduced errors into a MySQL database to determine the server's ability to recover from the sorts of hard and transient failures known to occur on the filesystems studied previously. Their results, coupled with the sweeping disk failure studies mentioned above, should give all DBAs reason to worry. I would encourage DBAs to review the papers below. And keep those backups handy.

 

[1] Sriram Subramanian, Yupu Zhang, Rajiv Vaidyanathan, Haryadi S. Gunawi, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Jeffrey F. Naughton (April 2010). Impact of Disk Corruption on Open-Source DBMS. In Proceedings, 2010 IEEE International Conference on Data Engineering, Long Beach, California. To appear.

 

[2] Bianca Schroeder and Garth A. Gibson (February 2007). Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you? In Proceedings, 5th USENIX Conference on File and Storage Technologies, San Jose, California, pp. 1-16.

 

[3] Ilias Iliadis and Xiao-Yu Hu (June 2008). Reliability Assurance of RAID Storage Systems for a Wide Range of Latent Sector Errors. Proceedings of the International Conference on Networking, Architecture, and Storage, Chongqing, China. IEEE Computer Society, ISBN 978-0-7695-3187-8.

 

[4] Eduardo Pinheiro, Wolf-Dietrich Weber, and Luiz André Barroso (February 2007). Failure Trends in a Large Disk Drive Population. In Proceedings, 5th USENIX Conference on File and Storage Technologies, San Jose, California, pp. 1-16.

 

[5] Haryadi S. Gunawi, Cindy Rubio-González, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Ben Liblit (February 2008). EIO: Error Handling is Occasionally Correct. In Proceedings, 6th USENIX Conference on File and Storage Technologies, San Jose, California, pp. 207-222.

 

[6] Alyssa Henry (February 2009). Cloud Storage FUD (Failure, Uncertainty, and Durability). Keynote address, 7th USENIX Conference on File and Storage Technologies, San Francisco, California.

 

[7] Lakshmi N. Bairavasundaram, Garth R. Goodson, Bianca Schroeder, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau (February 2008). An Analysis of Data Corruption in the Storage Stack. In Proceedings of the 6th USENIX Symposium on File and Storage Technologies (FAST '08), San Jose, California, pp. 223-238.

From the Archives: Holistic approaches to query performance analysis


Despite our efforts at making relational database systems such as SQL Anywhere self-managing, self-tuning, and self-healing, there remains the need to be able to diagnose and repair performance problems. In part, this requirement is due to the overall complexity of the optimization task. Query optimization is - still - an NP-hard problem and the input to the optimization process includes various heuristics, particularly predicate selectivity estimation. However, performance problems are also due to the increasingly complex hardware environment that characterize computing today. In this article, I want to highlight two recent papers that attempt to diagnose problems of this kind.

 

The first two articles [1,2] are work jointly authored by researchers at Duke University and IBM Almaden. The papers describe DIADS, a prototype performance diagnostic tool designed to assist in the diagnosis of performance problems when a database server - the authors use Postgres as a test system - utilizes a SAN (Storage Area Network) for its disk resources. The problem with a SAN is that it is a complex, independent system consisting of logical units of disk storage (pools or volumes) which typically services multiple application/database servers simultaneously. All too often SAN administration is done independently, forcing DBAs to treat a SAN as a "black box".

 

Borisov et al. developed a prototype diagnostic system called DIADS which uses annotated plan graphs to illustrate the use of SAN resources with specific query plan operators. Moreover, DIADS uses configuration dependency analysis and symptom signatures to drill down in the detail of specific annotated plans. DIADS has the ability to compare access plans for the same query, and computes metrics based on the deviation from the mean times for each operator. SAN monitoring data, including physical and logical configuration details, component connectivity, configuration changes over time, and DBA-defined events are included in annotated query plans, enabling DBAs to analyze the actual performance characteristics of specific plan operators together with detailed SAN statistics in a single display. DIADS contains a knowledge base as well, enabling expert-system analysis of plan operator performance degradation through the tracking of correlated and dependent plan operators, correlated operator cardinalities, and the inclusion of a symptoms database to help in the analysis of cause versus effect.

 

The second work is by Goetz Graefe, Harumi Kuno, and Janet Wiener of HP Labs. In their work, the authors study the problem of determining the level of robustness in a query execution engine - that is, the ability of a server to deliver consistent performance across a variety of unexpected run-time conditions: for example, errors in cardinality estimation or resource contention. The authors argue that query execution robustness is as important as the underlying fundamentals of the query operators themselves - and I couldn't agree more.

 

The authors' approach is to describe the robustness of a plan operator visually using plan robustness maps. These two- or three-dimensional graphs can then be used to reason about how particular execution strategies degrade as the amount of work increases or as system resources become constrained:

 

Reflecting on the visualization techniques employed here, these diagrams enable rapid verification of expected performance, testing of hypotheses, and insight into the absolute and relative performance of alternative query execution plans. Moreover, even for this very simple query, there is a plethora of query execution plans. Investigating many plans over a parameter space with multiple dimensions is possible only with efficient visualizations.

 

This work provides an interesting perspective on query execution performance that complements other work that addresses optimization quality, or dynamic re-optimization of SQL requests on-the-fly: that is, the ability of the system's optimizer to find the optimal plan for a specific set of system parameters.

 

[1] Nedyalko Borisov, Shivnath Babu, Sandeep Uttamchandani, Ramani Routray, and Aameek Singh (January 2009). Why did my query slow down? In Proceedings, 4th Biennial Conference on Innovative Data Systems Research (CIDR), Asilomar, California.

 

[2] Nedyalko Borisov, Shivnath Babu, Sandeep Uttamchandani, Ramani Routray, and Aameek Singh (February 2009). DIADS: Addressing the “My-Problem-or-Yours” Syndrome with Integrated SAN and Database Diagnosis. In Proceedings of the 7th USENIX Conference on File and Storage Technologies (FAST'09), San Francisco, California.

 

[3] Goetz Graefe, Harumi Kuno, and Janet L. Wiener (January 2009). Visualizing the robustness of query execution. In Proceedings, 4th Biennial Conference on Innovative Data Systems Research (CIDR), Asilomar, California.


Importing data from Microsoft Excel


In this post, originally written by Glenn Paulley and posted to sybase.com in November of 2009, Glenn talks about importing data from Microsoft Excel into a SQL Anywhere database.


 

One way to import data from a Microsoft Excel spreadsheet into a SQL Anywhere database is via the DBISQL INPUT statement. Here's an example:

 

INPUT USING 'DSN=myExcelFile'
FROM "myData" INTO "T"
CREATE TABLE ON

 

Note that this is a DBISQL statement, rather than a SQL statement that can be executed by the server. The components of this statement are as follows:

  • "myData" refers to a named matrix of rows and columns in the Excel spreadsheet, which will be used as input to the INPUT statement. In Excel Office 2007, one creates a named matrix of cells by executing the following steps:
    1. With the mouse, or using SHIFT-arrow, highlight the set of rows and columns desired within the worksheet to select them.
    2. Once highlighted, right-click on the selected rows.
    3. Scroll down to the menu item "Name a Range...." and press Enter or left-click.
    4. Type in your chosen name for this matrix of rows. We choose "myData" to correspond to the INPUT statement above.
    5. Save the modified spreadsheet.
  • DSN=. One needs to create an ODBC DSN in order for DBISQL to connect to the Microsoft Excel ODBC driver and read the rows and columns corresponding to "myData". To create the DSN:
    1. Start the Microsoft ODBC Administrator from your SQL Anywhere programs folder. Switch tabs to "System DSN", and then click "Add". Using System DSNs is important because in some scenarios User DSNs will not be found.
    2. Select an Excel ODBC driver. On my laptop I'm still running 32-bit Windows XP - so the two Excel ODBC drivers available are:
      1. The generic .xls driver ("Microsoft Excel Driver (*.xls)"), version 4.00.6305.00, dated 4/14/2008; and
      2. The Office 2007 driver ("Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)"), version 12.00.6211.1000, dated 8/24/2007.
    3. After selecting one of the above drivers, click "Finish".
    4. You are then shown a dialog with input boxes for the various driver parameters. Enter a Data Source Name to name the DSN (we choose "myExcelFile"), and use the "Browse" button to inform the ODBC Administrator where the spreadsheet is in your filesystem. Then click "OK".
  • The syntax FROM "myData" INTO "T" tells DBISQL to create a table "T" from the data range named "myData".

 

Et voila! You have imported the selected rows and columns into a SQL Anywhere table named "T".

 

If Only It Were That Simple

The above set of steps does work. Really. It differs from the SQL Anywhere 11.0.1 documentation in that I've included the Excel 2007 steps for naming a matrix of rows and columns; the 11.0.1 documentation specifies the steps when using the Office 2003 version of Excel.

 

Aside: A missing piece of the SQL Anywhere 11.0.1 documentation is that the Microsoft ODBC driver assumes that the first row of a named set of rows and columns contains the "column names" for the data, and the ODBC driver will return those cell values in response to metadata calls for that result set. Hence, if naming a set of rows and columns to be loaded, ensure that the first row contains the names of the columns you desire for the table "T" - with the DBISQL INPUT statement, DBISQL will use the names of the columns returned by the metadata calls to the underlying ODBC driver.

 

However, there are two significant (and related) problems with importing Excel data using this method, and they both are due to the behaviour of the Microsoft Excel ODBC drivers (both of the drivers I've documented above exhibit the same behaviour). The two problems are:

  • The Microsoft ODBC driver for Excel seemingly chooses the data types for the various columns in the named area arbitrarily; and
  • Data exceptions can result because the choice of data types may not match all cell contents. Depending on precisely how the result set is FETCHed through the Excel ODBC driver, the application may (1) receive an error, (2) have values that would cause a data exception returned as NULL, or (3) have the result set truncated without any notification.

 

The Excel driver's behaviour in choosing data types for each column of cells is partially explained in this 2003 Microsoft Knowledge Base article. There seems to be very little one can do to coerce the driver to choose a more generic type (such as string) when the data in a spreadsheet is dirty. It is these choices of data types that lead to the second problem.

 

The DBISQL INPUT statement causes DBISQL to open a cursor over the Excel data source using wide fetches - and upon the first data exception, the Microsoft driver returns end-of-file and the result set is effectively truncated without error. I experimented with two other JDBC-ODBC bridges - the freely-available Sun Microsystems bridge and the commercial Easysoft JDBC-ODBC bridge (available for a free trial) - and both drivers exhibited slightly different behaviour, returning the complete result set but with invalid data values substituted with NULLs, again with no indication of an error. My colleague Karim Khamis was kind enough to provide an example Java application to demonstrate the behaviour:

 

import java.io.*;
import java.sql.*;
import java.util.*;
class T
{
    public static void main( String args[] ) throws IOException
    {
        Connection con = null;
        System.out.println( "Starting ... " );
        con = connect();
        if( con == null ) {
            return; // exception should already have been reported
        }
        System.out.println( "Connected ... " );
        try {
            try {
                con.setAutoCommit( false );
            } catch( SQLException dummy ) {
            }

            ResultSet rs = con.getMetaData().getColumns( null, null, "myExcelFile", null );
            int colnum = 1;
            while( rs.next() ) {
                System.out.println( "Column " + colnum + " is named " + rs.getString(4)
                        + " with type " + rs.getString(6) + " with size/prec " + rs.getString(7)
                        + " with scale " + rs.getString(9) );
                ++colnum;
            }
            rs.close();
            System.out.println( "\n\n" );

            Statement stmt = con.createStatement();
            stmt.setFetchSize(1); // set to > 1 to enable wide fetches if supported
            rs = stmt.executeQuery( "select * from myData" );
            int colcount = rs.getMetaData().getColumnCount();
            int rownum = 1;
            while( rs.next() ) {
                System.out.print( "ROW " + rownum + ": " );
                ++rownum;
                for( int i = 1; i < colcount; ++i ) {
                    System.out.print( rs.getObject(i) + " === " );
                }
                System.out.println( rs.getObject(colcount) );
            }
            rs.close();
            con.close();
            System.out.println( "Disconnected" );
        } catch( SQLException sqe ) {
            printExceptions( sqe );
        }
    }

    private static Connection connect()
    {
        String driver, url;
        Connection connection;
        // System.out.println( "Using Sun JDBC-ODBC bridge..." );
        // driver = "sun.jdbc.odbc.JdbcOdbcDriver";
        // url = "jdbc:odbc:myData";
        System.out.println( "Using Easysoft JDBC-ODBC bridge..." );
        driver = "easysoft.sql.jobDriver";
        url = "jdbc:easysoft://localhost:8831/myData:trace=on";
        try {
            Class.forName( driver );
            connection = DriverManager.getConnection( url, "dba", "sql" );
        }
        catch( Exception e ) {
            System.err.println( "Error! Could not connect" );
            System.err.println( e.getMessage() );
            printExceptions( (SQLException)e );
            connection = null;
        }
        return connection;
    }

    static private void printExceptions( SQLException sqe )
    {
        while( sqe != null )
        {
            System.out.println( "Unexpected exception : " +
                    "SqlState: " + sqe.getSQLState() +
                    " " + sqe.toString() +
                    ", ErrorCode: " + sqe.getErrorCode() );
            System.out.println( "======================================" );
            sqe = sqe.getNextException();
        }
    }
}

The safe play here, unfortunately, is to export the data from Excel into something more amenable for loading, such as a CSV file. One can then use the various LOAD TABLE options to load the data, explicitly controlling the data types used for each column of T.
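
For example, a minimal sketch - the file name, table definition, and option list are illustrative, and the available LOAD TABLE options vary by release:

-- The table definition, not the ODBC driver, now dictates each column's data type
CREATE TABLE T( project_id INTEGER, short_desc VARCHAR(255), creation_ts TIMESTAMP );

LOAD TABLE T FROM 'c:\\exported_from_excel.csv'
     FORMAT TEXT
     DELIMITED BY ','
     QUOTES ON
     ESCAPES OFF;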

SQL Anywhere 16 Available on Linux ARM (Raspberry Pi)


In the last couple of years there has been an explosion of small, low-power, ARM-based computing devices that have hit the market. One example of these is the incredibly affordable Raspberry Pi. These ARM-based devices are excellent for embedded applications that run at the edge of a network.

 

This trend was very interesting for us because SQL Anywhere is a database designed to run in embedded applications at the edge of a network. Sounds like a good fit, doesn't it?

 

Well, we thought so too! That is why I am happy to announce today that SAP SQL Anywhere 16 is now available on Linux ARM.

 

If you want to test it out, go grab yourself a Raspberry Pi board for ~$35 and follow the steps below.

 

Pre-Requisites

  • Raspberry Pi installed with Raspbian (other Linux distributions and other ARMv6 and ARMv7 devices may work as well, but some commands may be different)
  • Internet connection to Raspberry Pi
  • Shell access to Raspberry Pi (either through SSH or connected display)

 

Getting Started with SQL Anywhere on Raspberry Pi

 

Register for the latest SAP SQL Anywhere 16 Developer Edition: https://global.sap.com/campaign/ne/sybase/sql_anywhere_16_download_program/index.epx?kNtBzmUK9zU

 

After registering you will be sent a registration key over email.


Open a shell on your Raspberry Pi (either through SSH or from the desktop). Download and extract the SQL Anywhere Developer Edition.

 

cd /tmp
wget http://d5d4ifzqzkhwt.cloudfront.net/sqla16developer/bin/sqla16developerlinuxarmv6.tar.gz
tar -xvf sqla16developerlinuxarmv6.tar.gz

 

Install SQL Anywhere using the key you received earlier over email (accept all of the defaults).

 

cd ga1600
sudo ./setup

 

Return to your home directory

 

cd ~

 

The SQL Anywhere executables and libraries are not added to the PATH and LD_LIBRARY_PATH environment variables automatically. You can add them to the current shell's environment by sourcing the configuration script.

 

. /opt/sqlanywhere16/bin32/sa_config.sh

 

To test out the environment, try executing the following. This should return the current version of the server (e.g. 16.0.0.1972).

 

dbsrv16 -v

 

Now that everything is set up, it is time to create a small application. Create a directory to store your application.


mkdir hellosensor
cd hellosensor

 

Next, we need to initialize an empty database. We will call this database hellosensor.db.

 

dbinit hellosensor.db

 

Start the database server (the -ud switch starts the server as a background daemon).

 

dbsrv16 -ud hellosensor.db

 

Python is the preferred development language for the Raspberry Pi, and it comes preinstalled, so that is what we will use. In order to connect to SQL Anywhere, we will need to install the SQL Anywhere Python driver. This can be installed through the Python package manager (pip).

 

First, make sure pip is installed.

 

sudo apt-get install python-pip

 

Then, install the SQL Anywhere Python Driver through pip.

 

sudo pip install sqlanydb

 

Create a file called helloworld.py with the following contents.

 

import sqlanydb
conn = sqlanydb.connect(uid='dba', pwd='sql', eng='hellosensor', dbn='hellosensor' )
curs = conn.cursor()
curs.execute("select 'Hello, world!'")
print "SQL Anywhere says: %s" % curs.fetchone()
curs.close()
conn.close()

 

Save the file, and test it out.

 

python helloworld.py

 

If successful, you should see this message.

 

SQL Anywhere says: Hello, world!

 

At this point, everything should be set up correctly. Let's create another application that reads fictitious sensor readings and stores them in a table. Create a file called hellosensor.py with the following contents.

 

# Import the SQL Anywhere Python driver
import sqlanydb
from time import sleep
from random import random
# Connect to the database
conn = sqlanydb.connect(uid='dba', pwd='sql', eng='hellosensor', dbn='hellosensor' )
# Create the table to hold the sensor readings (if not exists)
def setup():
    curs = conn.cursor()
    sql = ("CREATE TABLE IF NOT EXISTS Sensor("
           "  id INTEGER PRIMARY KEY DEFAULT AUTOINCREMENT,"
           "  reading FLOAT NOT NULL,"
           "  timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL"
           ")")
    curs.execute(sql)
    curs.close()

# This function would normally read a real sensor (temperature, etc.)
# For this sample, it returns a random float between 0 and 100
def read_sensor():
    return random() * 100

# Write the sensor value into the database
def write_reading(value):
    curs = conn.cursor()
    sql = "INSERT INTO Sensor(reading) VALUES (?)"
    curs.execute(sql, (value,))
    curs.close()
    # IMPORTANT! SQL Anywhere does not commit by default.
    # An explicit commit is required.
    conn.commit()

# Create tables
setup()
print('Press Ctrl-C to stop...')

# Read the sensor every 3 seconds, and insert each reading into the database
# Run until Ctrl-C is pressed
try:
    while True:
        value = read_sensor()
        print("Current sensor reading is %s" % (value,))
        write_reading(value)
        sleep(3)
except KeyboardInterrupt:
    # Close the connection
    conn.close()

 

(In a real application, you would probably replace read_sensor with something that reads the physical world through the GPIO pins, such as logging a temperature sensor.)

 

Save the file, and test it out.

 

python hellosensor.py

 

The output should look similar to this.

 

Press Ctrl-C to stop...
Current sensor reading is 67.012247981
Current sensor reading is 83.2578335957
Current sensor reading is 71.2944099229
Current sensor reading is 88.3533857105
Current sensor reading is 99.646246581

 

Everything is working, and you are successfully logging data to an embedded SQL Anywhere database. In the next blog post we will connect with the graphical administration tools from your development machine to view the saved data.
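
As a quick sanity check (a minimal sketch; run it from dbisql or over the same sqlanydb connection), you can confirm that the readings are accumulating:

-- Counts the logged readings and shows the time range covered
SELECT COUNT(*) AS readings,
       MIN( "timestamp" ) AS first_reading,
       MAX( "timestamp" ) AS last_reading
  FROM Sensor;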

From the Archives: Query performance problem determination in Version 9.0.2


In this post, originally written by Glenn Paulley and posted to sybase.com in November of 2009, Glenn talks about the undocumented 9.0.2 feature called "log expensive queries", which could be used in version 9.0.2.3113 and later to help diagnose performance issues related to specific queries.

 

In SQL Anywhere Version 10 we introduced Application Profiling, a set of graphical tools that could not only assist DBAs in diagnosing performance problems but could also recommend solutions to common performance problems discovered during the analysis. Included in Application Profiling is a (cheap) mechanism to capture queries (and their access plans) that consume a significant proportion of server resources or whose elapsed time exceeds a certain threshold.

 

But, you say - you're still running SQL Anywhere Version 9.0.2. Now what?

 

If, after preliminary problem determination, you believe you are experiencing a performance problem due to a suboptimal access plan for a specific SQL query, a diagnostic feature slipstreamed into 9.0.2 build 3113 may be useful. The feature is termed "log expensive queries" and it provides a means to log SQL statements and/or their graphical plans in a SQL Anywhere request-level log.

 

 

Log Expensive Queries

The "log expensive queries" feature permits the server to dump the SQL text or graphical plans of expensive queries to the request level log.  A new server command line option, -zx [cost], turns the feature on and defines the threshold above which queries are considered "expensive".  Expensive queries can be handled in two ways: when the -zp option is also specified, detailed graphical plans will also appear in the request-level log.  Otherwise, only the SQL text of the expensive queries will be logged.

When both -zx (LogExpensiveQueries) and -zp (RememberLastPlan) are set:

  • queries whose estimated cost is greater than [cost] milliseconds will be built with full statistics-gathering gear, and a graphical plan will be dumped to the request-level log at build time;
  • queries whose estimated cost is less than [cost] milliseconds will be built with low-detail statistics gathering gear;
  • queries whose actual run-time is greater than [cost] milliseconds will have their graphical plan dumped to the request-level log when their cursor is closed. Plans will contain whatever level of statistical detail was determined for them at optimization time (as described in the first bullet point above).


When only the -zx command line switch is specified:

  • query plans will be constructed with low-detail statistics gathering gear.
  • queries whose actual run-time is greater than [cost] milliseconds will have their SQL text dumped to the request-level log when their cursor is closed.

 

The request-level log must be directed to a file for these outputs to appear - this is accomplished by using the -zo command line option or setting the RequestLogFile property. However, both SQL text and graphical plans will still appear even if the RequestLogging property is 'None'.

 

Access plans appear as a single PLAN line in the request-level log; SQL text appears as a single INFO line.  They are prefixed by a header that indicates whether they were dumped at build time or cursor completion time, and what the associated cost at that time was.  Plans dumped at build time are prefixed by [XB [cost]], plans dumped at cursor completion time are prefixed by [XC [cost]], and SQL text dumped at completion time is prefixed by [XS [cost]] (where all costs are given in seconds, the same as displayed in graphical plans).

 

The LogExpensiveQueries property can be set by the -zx [cost] command line option, or by setting the LogExpensiveQueries server option with sa_server_option(), permitting the logging of expensive queries to occur on-the-fly.
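
For example, a hedged sketch of adjusting the feature on a running 9.0.2.3113+ server (the threshold value here is illustrative):

-- Start logging queries whose cost exceeds 5000 milliseconds, without a server restart
CALL sa_server_option( 'LogExpensiveQueries', '5000' );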

Join Us for the SQL Anywhere Technical Summit November 5-7, 2014


 

 

 

SAP SQL Anywhere Technical Summit

Date: November 5–7, 2014
Location: Waterloo, Canada

Join us for the SAP SQL Anywhere Technical Summit in Waterloo, Canada, November 5–7 - register today.

Please join us for the SAP SQL Anywhere® Technical Summit, which will include lectures, demos, and the opportunity to meet SAP SQL Anywhere engineers and members of the product management team, as well as interact with your development peers. The summit will take place in Waterloo, Canada, on November 5–7, 2014. This free event will cover a variety of topics and sessions on SAP SQL Anywhere data management software, including:

  • Upcoming release discussion, including new feature previews
  • SAP SQL Anywhere within SAP
  • SAP SQL Anywhere core content, including detailed performance and tuning information
  • MobiLink internals, including performance and new developments for the Internet of Things and mobile apps
  • Introductory MobiLink and UltraLite discussions for those interested in exploring data synchronization
  • Data management for OEMs wishing to move to cloud solutions

Space is limited for this event, so we encourage you to register as soon as possible. When you register, please complete the questionnaire to provide insight into your use of SAP SQL Anywhere and possible topics to cover during the event.

We look forward to seeing you in November.

Best regards,

SAP SQL Anywhere Product Team

For more questions, call Pam Barrowcliffe at +1-(519)-883-6352.


 

Please note that invitations are non-transferable.
This offer is extended to you under the condition that your acceptance does not violate any applicable rules or policies within your organization. If you are unsure whether your acceptance may violate any such rules or policies, we strongly encourage you to seek advice from your ethics or compliance official. For organizations that are unable to accept all or a portion of this complimentary invitation and would like to pay for your own expenses, SAP is happy to provide a reasonable market value and an invoice or other suitable payment process. Please find out whether the participation is taxable under your local tax laws. If you have any questions, please contact your employer's HR department or your tax advisor. We would like to inform you, that SAP will bear the German income tax according § 37 b income tax law for benefit in kinds to customers.

Installing SAP SQL Anywhere and SAP IQ on the Same Machine


The SAP IQ (IQ) documentation states that installing SAP SQL Anywhere (SQLA) and IQ on different machines avoids potential start-up problems.  The key word here is "potential" as it doesn't necessarily mean that you'll encounter problems.  There may be a situation where you need to install SQLA and IQ on the same computer (for example, installing SAP BusinessObjects because it embeds SQLA), so understanding how those "potential start-up problems" can arise will help you avoid them.

 

SAP IQ embeds the SQLA product, meaning that the SQLA database server binaries are installed alongside the IQ binaries. On a default Windows installation for IQ16, the SQLA binaries are installed in C:\sybase\IQ-16_0\Bin32 and C:\sybase\IQ-16_0\Bin64 (one folder is for 32-bit binaries, the other for 64-bit).  Unix installations have a similar directory structure.

 

What you need to keep in mind is that the IQ installation modifies your computer's PATH environment variable to include those two directories (at the beginning of PATH). So if you were to launch a Command Prompt and type "dbisql.exe", the Interactive SQL GUI tool launches from C:\sybase\IQ-16_0\Bin64 (assuming a 64-bit operating system).  The same applies to all the other SQLA binaries.

 

Now, the SAP SQL Anywhere installation also modifies your computer's PATH environment variable to include its binaries directories (this time at the end of PATH).  On a default Windows installation, the SQLA16 install adds these two folders to the PATH: C:\Program Files\SQL Anywhere 16\Bin64 and C:\Program Files\SQL Anywhere 16\Bin32.

 

Understanding what each install program does will help you prevent start-up problems.  Here are some examples of what the PATH environment variable would look like (assuming a 64-bit Windows machine):

 

  1. Installing SAP IQ only

    C:\sybase\IQ-16_0\Bin64;C:\sybase\IQ-16_0\Bin32
  2. Installing SAP SQL Anywhere only

    C:\Program Files\SQL Anywhere 16\Bin64;C:\Program Files\SQL Anywhere 16\Bin32
  3. Installing SAP IQ followed by SAP SQL Anywhere

    C:\sybase\IQ-16_0\Bin64;C:\sybase\IQ-16_0\Bin32;C:\Program Files\SQL Anywhere 16\Bin64;C:\Program Files\SQL Anywhere 16\Bin32
  4. Installing SAP SQL Anywhere followed by SAP IQ (notice the resulting PATH is the same as in the previous example)

    C:\sybase\IQ-16_0\Bin64;C:\sybase\IQ-16_0\Bin32;C:\Program Files\SQL Anywhere 16\Bin64;C:\Program Files\SQL Anywhere 16\Bin32

As you can see, when you install both IQ and SQLA, the PATH environment variable will always include IQ's entries before SQLA's.  That is the situation you want when using both products on the same machine.  To keep things simple and avoid start-up issues, I suggest you install SQLA first, and then IQ.

 

To further avoid problems, use the full path for any binaries you want to run.  In example #4 above, running "dbisql.exe" from the Command Prompt launches Interactive SQL from the IQ installation, while running "C:\Program Files\SQL Anywhere 16\Bin64\dbisql.exe" launches Interactive SQL from the SQLA installation.  Note that this only applies to binaries that are found in both the IQ and SQLA installations (i.e. they have the same file name).
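
If you are ever unsure which installation's server you actually ended up starting, a quick sanity check (a minimal sketch; the version strings of the IQ-bundled and standalone SQLA servers will differ) is to connect with Interactive SQL and query the version:

-- Report the version string of the engine this connection is talking to.
SELECT @@version;
-- SQL Anywhere also exposes this through a server property.
SELECT PROPERTY( 'ProductVersion' );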

 

There are many other products that embed SQLA, such as the aforementioned SAP BusinessObjects.  Note that those products typically DO NOT change the PATH environment variable and instead use the full path to call SQLA binaries.  If you find yourself in the situation where you need to install IQ and another product that embeds SQLA, simply perform the IQ installation last.  As an additional check, have a look at the PATH environment variable after installing the other product to see if any directories were added that also contain SQLA binaries.  In the case of SAP BusinessObjects, its installation will not conflict with SAP IQ because it does not set environment variables for SQLA and uses the full path to the SQLA binaries.

 

One last thing to be aware of is that the SQLA setup program creates entries in the registry (Windows) or in .ini files (Unix) for its install location and language locale.  These settings can also be controlled via environment variables as described here: SQL Anywhere environment variables.  When working with SQLA, use those environment variables instead of the registry or .ini file entries.

SQL Anywhere and Pi - opening doors


Background

 

Accsys develops enterprise Access Control and Time & Attendance solutions. 

 

Part of our offering is a mobile app (iOS and Android) that uses geo-fencing to allow employees to clock into work areas defined by their GPS coordinates.  Although the core of our solution revolves around biometric identification (fingerprint, face recognition, etc.), numerous use cases (drivers, security guards, sales people) dictate the need to report for duty where the employer is not the custodian of the infrastructure and biometric readers.

 

Geo-fencing enables these employees to 'clock-in' for duty from anywhere in the world.

 

Problem statement

So we know when a person arrived at his destination with his cargo (Time & Attendance), but what if he/she needs to enter the store-room to deliver his packages (Access Control)? How can we make that same mobile app drive the relay on a magnetic lock or electronic striker?

 

SQL Anywhere and the Raspberry Pi

Our mobile app already uses JSON web services developed and hosted on the client's SQL Anywhere database.  The next step was to use the same web service call that receives the request to 'clock in' at a given GPS location to also activate the necessary relays and grant access to one or more secure areas.
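
To give a flavour of what that looks like, here is a minimal sketch of a JSON web service hosted in a SQL Anywhere database; the table, procedure, and service names are hypothetical, and the real services in our solution carry considerably more logic:

-- Hypothetical clocking table and procedure; geo-fence validation omitted.
CREATE PROCEDURE clock_in( IN emp_id INTEGER, IN latitude DOUBLE, IN longitude DOUBLE )
RESULT( status VARCHAR(20) )
BEGIN
    INSERT INTO clockings( emp_id, latitude, longitude, clock_ts )
    VALUES( emp_id, latitude, longitude, CURRENT TIMESTAMP );
    SELECT 'accepted';
END;

-- Expose the procedure as a JSON web service; the database server must be
-- started with an HTTP listener, e.g. -xs http(port=8080).
CREATE SERVICE ClockIn
TYPE 'JSON' AUTHORIZATION OFF USER DBA
AS CALL clock_in( :emp_id, :latitude, :longitude );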

 

After loading Raspbian OS on the Pi, we installed Sybase SQL Anywhere 16 for ARM.

 

With our web services listening for incoming requests, it was relatively straightforward to extend the logic to initiate a Python script that uses GPIO to turn a GPIO pin on or off.  From there, it was a simple matter of driving the necessary relays to activate a number of doors in close proximity.
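
One way to kick off such a script from inside the database (the script path and arguments below are hypothetical) is the xp_cmdshell system procedure, assuming the server is permitted to execute external commands:

-- Fire the hypothetical relay script for door 1 without capturing its output.
CALL xp_cmdshell( 'python /home/pi/toggle_relay.py 1', 'no_output' );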

 

Next step

It is unlikely that this prototype will find its way into a final solution.  For one, the SQL Anywhere database that typically runs on a dedicated server to handle the load of full T&A and payroll calculations would struggle when running on the Pi.  But we can easily let the primary database relay the instruction to release a mag-lock to a secondary server running on a Pi next to the door that needs to be opened.
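
As a hedged sketch of that relay idea (the host name, port, service name, and parameter are all assumptions), the primary database could define a web-client procedure that calls a service exposed by the secondary server on the Pi:

-- Hypothetical web-client procedure defined in the primary database; the
-- procedure's parameters are passed along with the HTTP request.
CREATE PROCEDURE open_door( door_id INTEGER )
URL 'http://pi-door-controller:8080/OpenDoor'
TYPE 'HTTP:POST';

-- Release the mag-lock on door 1 from the primary server's business logic.
CALL open_door( 1 );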

 

Costs, reliability, scalability, and support considerations are to be evaluated.

 

We'll report back to this forum as we progress.

SQL Anywhere Publishes New TPC-C Benchmark Results


This morning the SQL Anywhere team published for review a new TPC-C Benchmark Result, which would place SQL Anywhere at number 1 in the top price/performance results for TPC-C.

 

Of note, the benchmark:

  • used a SQL Anywhere database 750 GB in size
  • used 90,000 connections
  • provided a throughput rate of 112,890 tpmC (transactions per minute)
  • ran on off-the-shelf hardware from Dell
  • required no advanced configuration and minimal tuning of the database server parameters

Licensing SAP SQL Anywhere in Virtual Environments - Updated


I am providing here an update on running SAP SQL Anywhere in virtual environments (VEs).

 

Is SAP SQL Anywhere supported running in a virtual environment in production?

Yes.  One of SQL Anywhere's strengths is its support for a wide variety of platforms, including virtual ones.  We will support our customers running SQL Anywhere on any OS in a virtual environment, provided that OS is listed as supported (see link above).  To ease tracking and diagnosis for technical support issues, our support team will sometimes ask customers to reproduce issues outside of the VM environment in order to remove as many irrelevant factors as possible when diagnosing a problem. If we do run into issues that are directly caused by running in a virtual environment, we can work with our customers and the virtualization vendor to diagnose and resolve these issues.

 

 

How is SQL Anywhere licensed in a virtual environment? 

For user-based licensing, it makes no difference whether or not the server is running in a virtual environment.

For chip licensing, you purchase a license for each chip on which you wish to run SQL Anywhere.  You are entitled to run as many instances of SQL Anywhere as you want on each chip you have licensed, regardless of whether or not virtualization is involved.  This means you can run as many SQL Anywhere servers as you want on as many VEs as you want on the chips that are licensed for SQL Anywhere.  This is different from the Sybase licensing policy, which required a separate license for each VM, essentially treating each VM as an independent piece of hardware.

 

Here are some examples that will hopefully clarify things.

Example 1

This configuration shows a server machine with 1 physical CPU chip with 1 core.  Two virtual environments (VEs) have been created.  Two instances of SQL Anywhere are running on one VE and one instance is running on the other.
VMLicensingExample1Core
Licensing:
Chip based - Each physical chip on which you wish to run SAP SQL Anywhere must be licensed.  In this case, a single chip license must be purchased.  This permits an unlimited number of instances of the database server on that licensed chip. 
Server & Users - Each VE running a database server needs a database server license.  There are 2 VEs. Each licensed VE can permit an unlimited number of instances.  Therefore, 2 database server licenses are required, plus a user license for each user connecting to each server.

 

Example 2

This configuration shows a box with 2 dual core CPU chips.  Two VEs have been created.  Two instances of SQL Anywhere are running on one VE and one instance is running on the other.
SAVMExample2CPU2Core
Licensing:
Chip based - Each physical chip on which you wish to run SAP SQL Anywhere must be licensed.  There are 2 physical chips, therefore, 2 chip licenses are required.
Server & Users -  Each VE running a database server needs a database server license.  There are 2 VEs. Therefore, 2 database server licenses are required, plus a user license for each user connecting to each server.

 

Example 3

This configuration shows a box with 4 CPU chips.  Five VEs have been created.  Four VEs are running SQL Anywhere accessing a single CPU chip.  The fifth VE is running SQL Anywhere accessing two CPU chips.

SAVMExample4CPU1Core

Licensing:
Chip based - Each physical chip on which you wish to run SAP SQL Anywhere must be licensed.  There are 4 physical CPU chips, therefore, 4 chip licenses are required.
Server & Users - Each VE running a database server needs a database server license.  There are 5 VEs. Therefore, 5 database server licenses are required, plus a user license for each user connecting to each server.

SQL Anywhere 2014 Technical Summit


Early in November, the 2014 SQL Anywhere Technical Summit was hosted in Waterloo, Ontario, Canada.

The invitation was open to all SQL Anywhere customers, and we had representatives from 30 different companies join us for 2.5 days of technical discussions with the SQL Anywhere engineering team.

 

This event was a great success, allowing customers unfettered access to the SQL Anywhere development, support and product management teams to discuss a wide variety of technical issues as well as hear about upcoming release plans for SQL Anywhere.  It also gave the SQL Anywhere engineering team direct access to customers, which they found extremely useful.  This direct communication helped the developers gain a better understanding of the direction customers are heading in with our technology and some of the specific pain points they are having in their business.  All of this information feeds back into the roadmap planning for the future of SQL Anywhere.

 

Here we see our audience learning about OData and SQL Anywhere and how they work together.

DSC_0935.JPG

 

 

There were also plenty of opportunities for customers and engineering to pick each other's brains during ad hoc conversations at lunch and a small evening reception that was held on site.

DSC_0931.JPG

 

Overall, all the attendees, including those from the SQL Anywhere team, found this event to be a great success.  In the post-event survey, 96% of the attendees were highly satisfied with the summit and would attend again if another one were held in the future.

From the Archives: Select Over an Update Statement Part 2


In this post, originally written by Glenn Paulley and posted to sybase.com in May of 2009, Glenn talks about using SELECT over various DML statements, a feature that was added to SQL Anywhere in version 12. This feature is worth reiterating, as it is very useful but lightly used.  For example, it provides a very simple way to retrieve the primary key of a newly inserted row as part of the actual insert statement.

For instance:

SELECT pkey_col
FROM ( INSERT INTO mytable( col2 ) VALUES( 'hello' ) )
     REFERENCING ( FINAL AS t_final )
ORDER BY 1

 

 

In a previous post in May 2009 I expressed admiration for a SQL language feature in IBM's DB2 product that permits one to use an update DML statement as a table expression in a query's FROM clause. Here is a simple example to illustrate DB2's syntax:


SELECT T_updated.*
FROM NEW TABLE ( UPDATE T SET x=7 WHERE y=6 ) AS T_updated
WHERE T_updated.pk IN ( SELECT S.fk FROM S WHERE S.z = 8 )


With this construction, it is straightforward to join the modified rows to other tables, return the modified rows to the application via a cursor, output the modified rows to a file, and so on. Without this extension, one would have to define an AFTER or BEFORE trigger to copy the modified rows to another (different) table, manage the contents of that other table (and handle concurrent updaters), and execute a separate SELECT statement over the trigger-inserted table (only) after the UPDATE statement had been executed. That's a fair amount of work just to output what changes an update statement made. I am pleased to report that this language feature is now available in SQL Anywhere Version 12.

 

Syntax and semantics

The grammar of this feature in SQL Anywhere 12 is as follows:

 

<table primary> ::= <table or query name> [ [ AS ] <correlation name> [ ( <derived column list> ) ] ]
      | other forms of table references ...
      | ( <dml derived table> ) REFERENCING <result option>

<dml derived table> ::= <delete statement>
      | <insert statement>
      | <merge statement>
      | <update statement>

<result option> ::= OLD [AS] <correlation name>
    | FINAL [AS] <correlation name>
    | OLD [AS] <correlation name> FINAL [AS] <correlation name>

This syntax differs from what is offered in DB2, largely for two reasons: the first is that we wanted to make the syntax simple for those applications that wanted to compute the contrasts between the new and old values of a row from an UPDATE or MERGE statement; and the second was to make it easy to join a dml-derived-table to other objects in the same request.

 

The semantics of a dml-derived-table is as follows. During DESCRIBE, the dml-derived-table is ignored. At OPEN time, the DML update statement is executed first, and the rows affected by that statement are materialized into a temporary table. The column names in the schema of that temporary table are taken directly from the table being modified - the dml-derived-table can update only one table. One can refer to these values by qualifying them with the correlation name given in the REFERENCING clause:

  • OLD columns contain the values as seen by the scan operator that finds the rows to include in the update operation.
  • FINAL columns contain the values after referential integrity checks have been made, computed and default columns filled in, and all triggers fired (excluding FOR STATEMENT AFTER triggers).

 

With these declarations come straightforward restrictions. For INSERT statements, you can only specify FINAL. For DELETE statements, you can only specify OLD. However, for UPDATE and MERGE statements, you can specify either or both correlation names, permitting easy comparison between the old and new values of any updated row. Note that while the grammar refers to these names as "correlation names", they don't refer to a separate range variable over the updated table. Rather, they are merely a syntactic device to make it simple to refer to old and new column values in the same statement, mirroring the syntax of row triggers. In other words, if you specify REFERENCING ( OLD AS O FINAL AS F ), there is an implicit join predicate: O.rowid = F.rowid.
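
As a small illustration (a sketch against the SQL Anywhere Demo database, not taken from the original post), the following statement applies a 7% price increase and reports each product's old and new price side by side:

SELECT o.ID, o.Name, o.UnitPrice AS OldPrice, f.UnitPrice AS NewPrice
FROM ( UPDATE Products SET UnitPrice = UnitPrice * 1.07 )
     REFERENCING ( OLD AS o FINAL AS f )
ORDER BY o.ID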

 

Example

Here is a simple example showing how one can use a dml-derived-table in a query to report on the impact of a specific database update. Using the sample Demo database, the query answers the following question:

Update all Products with a 7% price increase. List the affected Products and their Orders which were shipped between 10 April 2000 and 21 May 2000 and whose order quantity is greater than 36.

(Aside: the Demo database contains only dates between 2000 and 2001, making the query above a bit convoluted, but I think the reader should be able to understand the general idea.) Here's what the query/update statement looks like in DBISQL:

 

select_dml_isql.PNG
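
In case the screenshot does not render, here is a sketch of what such a statement could look like; the exact query shown in the image may differ:

SELECT final_products.ID, final_products.Name,
       old_products.UnitPrice AS OldPrice, final_products.UnitPrice AS NewPrice,
       so.ID AS OrderID, soi.Quantity, soi.ShipDate
FROM ( UPDATE Products SET UnitPrice = UnitPrice * 1.07 )
         REFERENCING ( OLD AS old_products FINAL AS final_products )
     JOIN SalesOrderItems soi ON soi.ProductID = final_products.ID
     JOIN SalesOrders so ON so.ID = soi.ID
WHERE soi.ShipDate BETWEEN '2000-04-10' AND '2000-05-21'
  AND soi.Quantity > 36
ORDER BY final_products.ID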

 

and its graphical plan. Note the new "DML" operator in the access plan tree, signifying the generation of the temporary table containing the modified rows:

 

select_dml_plan.PNG

 

Another advantage of using "correlation names" (the row trigger syntax) is that one need not come up with another set of unique column names for the updated table. Simply qualifying the column name with the OLD or FINAL correlation name is sufficient.

 

At the moment, we are restricting dml-derived-tables to include at most one updated table, even though SQL Anywhere has long supported UPDATE statements over join table expressions. We may relax this restriction in a subsequent release.

From the Archives: Select Over an Update Statement Part 3


In this post, originally written by Glenn Paulley and posted to sybase.com in January of 2010, Glenn continues his discussion on using SELECT over various DML statements, a feature that was added to SQL Anywhere in version 12.

 

In a previous article, I presented some examples of how one can SELECT rows from a dml-derived-table, a new SQL language feature of the SQL Anywhere 12 server. In this post, I want to briefly describe some other ways in which one can exploit dml-derived-tables to simplify applications.

 

The first thing to mention is that even though the title of this post is "SELECT over an UPDATE statement", dml-derived-tables are derived tables and hence can be used in any context one might use a derived table - including, of course, update statements (INSERT, MERGE, DELETE, UPDATE). Consequently one can effectively "nest" one UPDATE statement within another; here is an example using MERGE in combination with UPDATE:

 

CREATE TABLE modified_employees
( EmployeeID INTEGER PRIMARY KEY, Surname VARCHAR(40), GivenName VARCHAR(40) )


MERGE INTO modified_employees AS me
USING ( SELECT modified_employees.EmployeeID,
               modified_employees.Surname,
               modified_employees.GivenName
        FROM ( UPDATE Employees
               SET Salary = Salary * 1.03
               WHERE ManagerID = 501 )
             REFERENCING ( FINAL AS modified_employees ) ) AS dt_e
    ON dt_e.EmployeeID = me.EmployeeID
WHEN MATCHED THEN SKIP
WHEN NOT MATCHED THEN INSERT
OPTION(optimization_level=1, isolation_level=2)

In the above example, the table "modified_employees" models a collection of Employees whose state has been altered; the MERGE statement above merges employee identifiers and names for those employees whose salary has been increased by 3% with those employees already extant in the "modified_employees" table.

 

Second, note the OPTION clause at the end of the MERGE statement above. Option settings that are temporarily set using the OPTION clause for this statement also apply to the nested UPDATE statement, in addition to the outermost MERGE.

 

Third, since a dml-derived-table is merely a derived table, multiple dml-derived-tables can exist within the same SQL statement. Below is an example that combines independent updates of both the Products and SalesOrderItems tables in the Demo example database, and then produces a result based on a join that includes these modifications:


SELECT old_products.ID, old_products.Name, old_products.UnitPrice AS OldPrice,
       final_products.UnitPrice AS NewPrice,
       SalesOrders.ID AS OrderID, SalesOrders.CustomerID,
       old_order_items.Quantity,
       old_order_items.ShipDate AS OldShipDate,
       final_order_items.ShipDate AS RevisedShipDate
FROM ( UPDATE Products SET UnitPrice = UnitPrice * 1.07 )
         REFERENCING ( OLD AS old_products FINAL AS final_products )
     JOIN
     ( UPDATE SalesOrderItems
       SET ShipDate = DATEADD( DAY, 6, ShipDate )
       WHERE ShipDate BETWEEN '2000-04-10' AND '2000-05-21' )
         REFERENCING ( OLD AS old_order_items FINAL AS final_order_items )
       ON ( old_order_items.ProductID = old_products.ID )
     JOIN SalesOrders ON ( SalesOrders.ID = old_order_items.ID )
WHERE old_order_items.Quantity > 36
ORDER BY old_products.ID


In cases where multiple dml-derived-tables exist, the order of execution of each update statement is implementation-defined and not guaranteed.


In a graphical plan, a dml-derived-table is indicated by a "DML" node in the access plan tree:


select_dml_join_plan.PNG


and the plans for the statements represented by the "DML" nodes are listed as subqueries:


select_dml_join_update_plan.PNG


Finally, the SQL Anywhere 12 GA release will feature support for embedding an update statement as a  dml-derived-table, but without materializing its result. In this case, one uses the syntax REFERENCING( NONE ) to signify that the results of the modified data are not directly available to outer blocks within the same SQL statement (although the effect of such a statement may be). Hence the SQL grammar for a dml-derived-table is as follows:

 

<table primary> ::= <table or query name> [ [ AS ] <correlation name> [ ( <derived column list> ) ] ]
      | other forms of table references ...
      | ( <dml derived table> ) REFERENCING <result option>

<dml derived table> ::= <delete statement>
      | <insert statement>
      | <merge statement>
      | <update statement>

<result option> ::= OLD [AS] <correlation name>
    | FINAL [AS] <correlation name>
    | OLD [AS] <correlation name> FINAL [AS] <correlation name>
    | NONE

 

When using REFERENCING( NONE ), the result of the update statement is not materialized and hence is empty. Since the dml-derived-table is empty, one must carefully craft the nested statement to ensure that the intended result will be returned. The server internally uses REFERENCING( NONE ) for immediate materialized view maintenance. Within an application, one can ensure that a non-empty result will be returned by placing the dml-derived-table on the null-supplying side of an outer join, e.g.:


SELECT 'completed' AS finished, ( SELECT COUNT(*) FROM Products ) AS product_total
FROM SYS.DUMMY LEFT OUTER JOIN
     ( UPDATE Products SET UnitPrice = UnitPrice * 1.07 ) REFERENCING ( NONE )
     ON 1 = 1


or, in a more straightforward way, as part of a query expression using one of the set operators (UNION, EXCEPT, or INTERSECT):


SELECT 'completed' as finished, (SELECT COUNT(*) FROM Products) as product_total
FROM SYS.DUMMY
UNION ALL
SELECT 'dummy', 1 /* This query specification will return the empty set */
FROM ( UPDATE Products SET UnitPrice = UnitPrice * 1.07 ) REFERENCING ( NONE )


Adding additional server versions to a SAP SQL Anywhere, on-demand cloud


One of the cool features of SAP SQL Anywhere, on-demand edition is that you can run multiple server versions all within the same cloud. For example, it is possible to run both SQL Anywhere 12 and SQL Anywhere 16 servers at the same time, even on the same machine. (Note that this is different from running multiple database versions at the same time, but you can do that too.)

 

When you create a SQL Anywhere, on-demand edition cloud it will only have one server version available. To use a different server version, you will have to install the new server software into your cloud. This quick tutorial shows the steps required to add SQL Anywhere 12 support to an existing SP5 cloud.

 

  1. Create a SQL Anywhere, on-demand edition SP5 cloud. If you don't have a cloud, you can create one using the developer edition. The default server included in SP5 is SQL Anywhere 16.0.0.1824. Open the cloud console and click on the Cloud Software link. You will see that there is only one server version currently installed. (Capture2.PNG)
  2. Download the new server version from the SAP website. Browse the list of available server versions, and download the install package for the server version that you want to add to your cloud. For this demo, I am downloading the SQL Anywhere 12.0.1.4086 for Windows package.
  3. The install package must be added to the cloud before it can be installed. Click Events -> Run new task. (Capture4.PNG)
  4. Select AddCloudSoftwareUpdateFromFile. (Capture5.PNG)
  5. Enter the full path to the install package you downloaded in Step 2. Click Next. This adds the install package to the cloud. (Capture6.PNG)
  6. Click on the Cloud Software tab and confirm that the new server version has been added to the cloud. (Capture7.PNG)
  7. Although the new server version has been added to the cloud, it has not yet been installed. The install package only has to be added to the cloud once, but it must be installed on every host on which you want to run that server version. This allows you to limit certain server versions to specific hosts. Fortunately, the cloud console makes it easy to install the software on any host. Select the new cloud software version and click Install. (Capture8.PNG)
  8. Enter the list of hosts on which you want the new server version installed. (Capture9.PNG)
  9. The cloud console will coordinate the installation of the new server version on all of the target hosts. Once the install is complete, you will see that the new server version appears in the list of Installed Cloud Software. (Capture10.PNG)
  10. When creating a new server, you will now be able to select the new server version. (Capture11.PNG)