I currently deal with large quantities of data. So much, in fact, that our project has stored procedures that truncate tables, because there is not enough tablespace to handle delete queries. This works fine if I want to delete all the records in a table, but what if I only want to delete a subset, let's say 50,000 of several million records? With Groovy and its built-in SQL functionality, this becomes very easy to accomplish. Here is the complete program:
import groovy.sql.Sql
def sql = Sql.newInstance("${jdbc.url}", "${jdbc.userId}", "${jdbc.password}", "${jdbc.driver}")
def query = "select customer_id from customer where customer_group_id = 11020"
sql.eachRow(query as String, {row -> sql.execute("delete from customer where customer_id = ?", [row.CUSTOMER_ID])})
The Sql object represents the connection to the database. The first thing we do is open that connection using the newInstance method, which takes a valid JDBC URL, database user ID, database password, and JDBC driver class name.
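The `${jdbc.*}` values in the listing are placeholders, so the script needs something named `jdbc` in scope before it can run. Here is a minimal sketch of one way to supply it, assuming a plain map is acceptable. Every value in the map is made up for illustration, and the `connect` closure is my own name, not part of Groovy.

```groovy
import groovy.sql.Sql

// Hypothetical connection settings: replace every value with your own.
def jdbc = [
    url     : 'jdbc:somedb://dbhost:1234/mydb',
    userId  : 'someuser',
    password: 'secret',
    driver  : 'com.example.jdbc.Driver'
]

// Wrapped in a closure so that nothing connects until it is called.
def connect = { Sql.newInstance(jdbc.url, jdbc.userId, jdbc.password, jdbc.driver) }

// Usage (needs a reachable database and the driver on the classpath):
// def sql = connect()
// try {
//     // ... run queries ...
// } finally {
//     sql.close()
// }
```

With the map defined, the `"${jdbc.url}"` style used in the listing resolves to the same values, and closing the connection in a `finally` block keeps it from leaking if a query fails.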
Note: The JDBC Driver must exist on the classpath when executing this script.
The query variable holds the SQL statement that returns the IDs of all the customers I would like to delete from the customer table.
Using the eachRow method on the Sql object, I execute this anonymous inline closure for each row as it is returned:
{row -> sql.execute("delete from customer where customer_id = ?", [row.CUSTOMER_ID])}
This executes one delete statement for each row returned by the original query.
Running the script is just as simple:
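Deleting one row per statement means one round trip to the database per customer. As a possible variation, here is a sketch that collects the IDs first and then sends the deletes in chunks using the withBatch method on the Sql object. The `deleteInChunks` name and the chunk size of 1000 are my own assumptions, and whether chunking eases the tablespace pressure depends on your database and commit settings, so measure before relying on it.

```groovy
// Sketch only: `sql` is expected to be an open groovy.sql.Sql instance.
// deleteInChunks and its default chunk size are illustrative assumptions.
def deleteInChunks(sql, List ids, int chunkSize = 1000) {
    // Split the ids into sublists of at most chunkSize elements.
    def chunks = ids.collate(chunkSize)
    chunks.each { chunk ->
        // One JDBC batch per chunk, using the same parameterized delete.
        sql.withBatch("delete from customer where customer_id = ?") { ps ->
            chunk.each { id -> ps.addBatch([id]) }
        }
    }
    return chunks.size()
}

// Usage, reusing the sql and query variables from the program above:
// def ids = sql.rows(query).collect { it.CUSTOMER_ID }
// deleteInChunks(sql, ids)
// sql.close()
```

Reading all the IDs up front with rows also avoids deleting from the table while a result set over it is still open.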
groovy -cp .\lib\<> deleteCustomers.groovy
This is one of the many uses that I have found for this simple, yet powerful dynamic language. I hope to share more with you in the future.
Enjoy.