Remove duplicate rows from table in PostgreSQL

I was recently looking for a quick way to delete duplicate rows from large table (around half a billion rows) in PostgreSQL. In my case the table didn’t have a primary key – it was the result of importing a bunch of CSV files using the copy command. After some googling it turned out that the quickest way is also the easiest to understand. So here is how its done:

-- Create a new table with only unique rows:
create table temp as
    select distinct on (lon,lat,label) * from buildings;

-- Check the before and after counts:
select count(*) as before from buildings;
select count(*) as after from temp;

-- Now drop the original table
drop table buildings;
-- Rename the temporary one.
alter table temp rename to buildings;

Essentially, a new table is created with the duplicated rows removed then the original table is dropped and the new table is renamed. Keep in mind that this method doesn’t specify which row is kept so be careful if that is important to you.

In this case I only needed to have unique values in columns lon, lat and label. You can replace those with what fits you case. If you want to do a full table compare (i.e. your imported CSV files have full duplicated rows) you can use * instead like this:

create table temp as
    select distinct * from buildings;

PHP 7.4 Preloading

A cool new feature coming up with PHP 7.4 will give a nice speed boost while solving one of the problems that hold PHP back. As explained in the internals RFC we’ll be able to specify a single PHP file in php.ini (under the opcache.preload directive) which will take care of loading all other PHP files that contain shared code and keeping them in memory before any application code is executed. The preloader PHP script can either use the include command the traditional way or the new function opcache_compile_file to force scripts to be compiled and kept in memory. It will also resolve any dependencies and load them accordingly which wasn’t possible before using the opcache. The shared code will then be available to any PHP scripts running on the same instance without having to do any include/require or pass through composer’s autoloader.

Performance gains from this feature will depend mainly on how much actual work is done during each request as opposed to bootstrapping the framework. Requests will short run-time should see the highest boost in performance.

While this feature will benefit anyone running their applications isolated it will probably not be available for shared hosting scenarios where the same PHP instance is used for multiple sites.

JPHP – Compile PHP directly to JVM (Java) Bytecode

I stumbled upon the JPHP project today which adds yet another direction to the PHP world. JPHP is an one man project by Dmitriy Zayceff who’s been working on it for the past 2 years and I must admit he must have spent a lot of time on it. Here’s some key points:

  • Compile PHP code into Java Virtual Machine bytecode (into JAVa’s .class files)
  • Use the standard PHP function library including some extensions.
  • Can use any JAVA class through PHP code (you need to write a wrapper for it but it works)
  • Create cross platform GUI or CLI applications
  • Develop Android apps in PHP
  • It executes faster than the current PHP branch and even faster than PHP7.
  • Allows for both stateful and stateless execution when used as a Web scripting language (i.e. you can either choose to execute as a long-running process with shared memory or reset with every request like standard PHP.
  • Can do just in time compilation of PHP files.

Project’s Github page.

Benchmarks can be found here. Page is in Japanese but all you need to look at is the two tables. The first shows the time taken for the 1st run of the benchmarks and the next run where the code is already pre-compiled and JPHP wins.

Chrome DevTools – Quickly use currently selected element in console

I am sure like me, there’s been many times when using Chrome Developer Tools you selected an DOM element in the Elements panel and wished you could easily get a reference for it so you can use it in the console. It turns out it was possible and super-easy all along by using the $0 magic variable in Console. $0 automatically holds a reference to the currently selected element in the Elements panel. As a bonus the using $_ instead you get access to the last evaluated expression whether its an object or a scalar. Here’s how it looks:

$0 and $_ in Chrome Developer Tools

$0 and $_ in Chrome Developer Tools

 

 

Source: DevTips Daily

Recent & Worthwhile in Web Development

PHP

  • Remember Me – Safely Secure Long Term Authentication Strategies
    A detailed explanation on how to implement the Sign in ‘Remember Me’ functionality securely in PHP (the concepts are the same for every language). I must admit I never considered timing attacks for the authentication token until I read this.
  • Laravel 5.0 was released
    The 5th installment of one of the most popular PHP MVC and all-around frameworks was released. Wrote a summary of the changes in another post.
  • HHVM 3.5.0 Released
    The Facebook supported alternative PHP VM has a new version.
  • Cockpit
    A plug-n-play CMS for PHP sites. Targeted to developers, who built web apps and don’t want to re-invent the wheel every time for the content parts of their app. Uses SQLlite or MongoDB as data store.
  • Return Types Declarations approved for PHP7
    In PHP7 we’ll be able to declare the return type for functions and will be enforced causing a fatal error if a different type is returned.  Here’s an example of how it will look taken straight from the RFC:
    function foo(): array {
      return [];
    }

JS /HTML / CSS

  • ReactJS for Stupid People
    A good introduction on what Facebook’s ReactJS is about and why you should use it.
  • Animated Bezier Curves
    Demonstration of how parametric Bezier curves are constructed. Uses D3, and the code is straight forward, just View Source.
  • Konva JS
    Object-oriented 2D HTML5 Canvas Framework. Event driven, layer support, drag and drop, shape nesting, animation and all the usual bells and whistles.
  • mProgress.js
    Material Design style animated progress bar. Uses CSS3, has no other library dependencies and has beyond the ordinary features such as indeterminate progress (7.7kb minified JS).
  • RubaXa Sortable
    Minimal & customizable Sortable with touch support and no dependencies.
  • fieldVal
    Programmatic Javascript Field Validation library (runs both on browser and server). Includes an optional UI library for presenting validation errors in forms.
  • TauCharts
    Javascript charting library which aims to be flexible rather than offering pre-packaged chart setups.

PHP: array_walk vs foreach

Its a minor optimization but might make a difference if you process large arrays. Built-in PHP function array_walk allows you to apply a function to every element in an array. Obviously you can get the same results using a foreach. One would expect the built-in function to be faster since its written in C and hopefully has been highly optimized but it turns out, that’s not the case. After a bit of online research and some tests, foreach is the clear winner. Here’s the examples:

<?php
$test = array_fill(0, 10000, 'test_data');
$start = microtime(true);

foreach ($test as $key => $value)
{
    $result[$key] = 'testdata';
}

$end = microtime(true);
$time = $end - $start;
printf( "%0.5fs", $time );
Using foreach - 0.01022s

<?php
$test = array_fill(0, 10000, 'test_data');
$start = microtime(true);

array_walk($test, function($value, $index)
{
    $value = 'testdata';
});

$end = microtime(true);
$time = $end - $start;
printf( "%0.5fs", $time );
Using array_walk - 0.08700s

foreach wins. Case closed.

Laravel 5.0 Final released

  • New project directory structure – now looks more intuitive.
  • Now more components implement interfaces for better extensibility.
  • Route Caching for faster performance.
  • Middleware support for routes.
  • Common Authentication functionality out of the box (controllers for authentication, registration, forgot password and their respective views are all included).
  • Job Scheduler so you only need to run one cron job and the scheduler picks what to run from the queued jobs for you.
  • and more…

Have a look at the release notes.

New CSS draft includes support for Hex color values with alpha

By now you know that you can specify transparency in CSS color values (useful for properties like background-color) using the CSS RGBA function like this:

.transparent-blue {
   background-color: rgba(0, 0, 255, .5);
}

It turns out the new CSS Level 4 draft finally includes support for adding transparency to hex based color values – by using 8 digits instead of 6 giving 255 levels of opacity. Of course at the time this was being written its not supported by any browser but its good to know its coming. Here’s how it will look:

.new-transparent-blue {
   background-color: #0000ff88;
}

.new-transparent-blue-shorthand {
  background-color: #00f8;
}

How to mysqldump without auto increment values in Windows

In order to run PHPUnit tests I use a boostrap file which uses mysqldump to export the ever-changing database schema without any data, and import it into the testing database. After using this method for a while I stumbled on a problem with the auto-increment values for the table primary keys. There is no easy way to get mysqldump to avoid including the next auto-increment value in the dump. The problem this causes is that if your tests depend on specific primary key values things will go wrong the second time you run your test suite. There is a way to fix this by piping mysqldump’s output into sed. Here’s the actual command:

mysqldump -d [devDB] | sed 's/AUTO_INCREMENT=[0-9]*//' >

This will work under Linux but sadly not on Windows which doesn’t have sed but there’s a solution for that. (more…)

How to enable WordPress Debug Mode in 60 seconds

If you are a WordPress developer or if you just installed a misbehaving plugin you will need to enable WordPress’s debug mode in order to figure out what made your site get into the dreaded blank white page (unlike Windows WordPress has a White-Screen-Of-Death). So here’s the quick step-by-step guide to get you going:

Step by Step

  1. Connect via FTP to your site (use FileZilla which is free, if you don’t have an FTP client already installed) or just open your text editor if you are running local.  (15 seconds)
  2. Browse to WordPress home directory, and open wp-config.php for editing. (10 seconds)
  3. Hit Ctrl-Endor scroll to the bottom of the document, locate the line that says define(‘WP_DEBUG’, false);  – you can find it a few lines before the end of the file. (15 seconds)
  4. Change false to true and Save. (15 seconds)
  5. Refresh the page and you should be able to see the errors.  (5 seconds)

 

Total: 60 seconds.

 

Its recommended you don’t run this option on a live production site but if you really must, make sure you turn if back off when you are done. Alternatively you can enable the debug log to be sent to a file instead of the browser by also adding the following line in your config file:

define(‘WP_DEBUG_LOG’, true);

This will cause the error messages to be saved in a file called debug.log in the wp-content sub-folder of your installation. Again make sure this file is protected because by default anyone from the web can access it in the particular folder.

 

And if my instructions are not detailed enough you can look at the long version on the WordPress codex.