Scientific Computing, TU Berlin, WS 2019/2020, Lecture 03

Jürgen Fuhrmann, WIAS Berlin

Additional Info on Julia installation

  • There is Julia Pro, a Julia distribution with an additional registry of curated packages. It comes bundled with Juno for the Atom editor
  • All info on installation is collected on the course homepage

Thanks to Obin Sturm for the hint...

Recap

  • General info
  • Adding packages
  • Assignments, simple data types
  • Vectors and matrices
  • Basic control structures

Julia type system

  • Julia is a strongly typed language
  • Knowledge about the layout of a value in memory is encoded in its type
  • Prerequisite for performance
  • There are concrete types and abstract types
  • See WikiBook for more

Concrete types

  • Every value in Julia has a concrete type
  • Concrete types correspond to computer representations of objects
  • Inquire type info using typeof()
  • One can initialize a variable with an explicitly given fixed type
    • Currently possible only in the body of functions and for return values, not in the global context of Jupyter or the REPL
In [1]:
function sometypes()
    i::Int8=10
    @show i,typeof(i)
    x::Float16=5.0
    @show x,typeof(x)
    z::Complex{Float32}=15+3im
    @show z,typeof(z)
    return z
end
z1=sometypes()
@show z1,typeof(z1);
(i, typeof(i)) = (10, Int8)
(x, typeof(x)) = (Float16(5.0), Float16)
(z, typeof(z)) = (15.0f0 + 3.0f0im, Complex{Float32})
(z1, typeof(z1)) = (15.0f0 + 3.0f0im, Complex{Float32})

Vectors and Matrices have concrete types as well:

In [2]:
function sometypesv()
    iv=zeros(Int8, 10)
    @show iv,typeof(iv)
    xv=[Float16(sin(x)) for x in 0:0.1:1]
    @show xv,typeof(xv)
    return xv
end
x1=sometypesv()
@show x1,typeof(x1);
(iv, typeof(iv)) = (Int8[0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Array{Int8,1})
(xv, typeof(xv)) = (Float16[0.0, 0.09985, 0.1986, 0.2954, 0.3894, 0.4795, 0.5645, 0.644, 0.7173, 0.783, 0.8413], Array{Float16,1})
(x1, typeof(x1)) = (Float16[0.0, 0.09985, 0.1986, 0.2954, 0.3894, 0.4795, 0.5645, 0.644, 0.7173, 0.783, 0.8413], Array{Float16,1})

Structs allow defining user-defined concrete types

In [3]:
struct Color64
    r::Float64
    g::Float64
    b::Float64
end
c=Color64(0.5,0.5,0.1)
@show c,typeof(c);
(c, typeof(c)) = (Color64(0.5, 0.5, 0.1), Color64)

Types can be parametrized (similar to arrays)

In [4]:
struct TColor{T}
    r::T
    g::T
    b::T
end
c=TColor{Float16}(0.5,0.5,0.1)
@show c,typeof(c);
(c, typeof(c)) = (TColor{Float16}(Float16(0.5), Float16(0.5), Float16(0.1)), TColor{Float16})

Functions, Methods and Multiple Dispatch

  • Functions can have different variants of their implementation depending on the types of parameters passed to them
  • These variants are called methods
  • All methods of a function f can be listed calling methods(f)
  • The act of figuring out which method of a function to call depending on the type of parameters is called multiple dispatch
In [5]:
function test_dispatch(x::Float64)
    println("dispatch: Float64, x=$(x)")
end
function test_dispatch(i::Int64)
    println("dispatch: Int64, i=$(i)")
end
test_dispatch(1.0)
test_dispatch(10)
methods(test_dispatch)
dispatch: Float64, x=1.0
dispatch: Int64, i=10
Out[5]:
2 methods for generic function test_dispatch:
  • test_dispatch(i::Int64) in Main at In[5]:5
  • test_dispatch(x::Float64) in Main at In[5]:2
  • Typically, Julia functions have lots of possible methods
  • Each method is compiled to different machine code and can be optimized for the particular parameter types
In [6]:
using LinearAlgebra
methods(det)
Out[6]:
23 methods for generic function det:

The function/method concept roughly corresponds to C++14 generic lambdas:

 auto myfunc=[](auto &y, auto &x)
   {
     y=sin(x);
   };

is equivalent to

In [7]:
function myfunc!(y,x)
    y=sin(x)
end
Out[7]:
myfunc! (generic function with 1 method)

Many generic programming approaches possible in C++ also work in Julia.

If not specified otherwise via parameter types, Julia functions are generic: "automatic auto"
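
As a minimal sketch (my own example, not part of the original notebook): calling such an untyped function with arguments of different types silently compiles a specialized method for each combination of argument types.

# Sketch: one generic definition, many specializations
square(x) = x * x            # no type annotation: works for anything supporting *
@show square(3)              # compiled for Int64
@show square(3.0)            # compiled for Float64
@show square(3 + 4im)        # compiled for Complex{Int64}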

Abstract types

  • Abstract types label concepts which work for several concrete types without regard to their memory layout etc.
  • All variables with concrete types corresponding to a given abstract type (must) share a common interface
  • A common interface consists of a set of methods working for all types exhibiting this interface
  • The functionality of an abstract type is implicitly characterized by the methods working on it
  • "duck typing": use the "duck test" — "If it walks like a duck and it quacks like a duck, then it must be a duck" — to determine if an object can be used for a particular purpose (see the sketch after this list)

Examples of abstract types

In [8]:
function sometypesa()
    i::Integer=10
    @show i,typeof(i)
    x::Real=5.0
    @show x,typeof(x)
    z::Any=15+3im
    @show z,typeof(z)
    return z
end
sometypesa()
(i, typeof(i)) = (10, Int64)
(x, typeof(x)) = (5.0, Float64)
(z, typeof(z)) = (15 + 3im, Complex{Int64})
Out[8]:
15 + 3im

Although we try to force the variables to have an abstract type, they end up having a concrete type which is compatible with the abstract type

The type tree

  • Types can have subtypes and a supertype
  • Concrete types are the leaves of the resulting type tree
  • Supertypes are necessarily abstract
  • There is only one supertype for every (abstract or concrete) type:
In [9]:
supertype(Float64)
Out[9]:
AbstractFloat
  • Abstract types can have several subtypes
In [10]:
using InteractiveUtils
subtypes(AbstractFloat)
Out[10]:
4-element Array{Any,1}:
 BigFloat
 Float16 
 Float32 
 Float64 
  • Concrete types have no subtypes
In [11]:
subtypes(Float64)
Out[11]:
0-element Array{Type,1}
  • "Any" is the root of the type tree and has itself as supertype
In [12]:
supertype(Any)
Out[12]:
Any

Walking the type tree

In [13]:
function showtypetree(T, level=0)
    println("  " ^ level, T)
    for t in subtypes(T)
        showtypetree(t, level+1)
    end
end
showtypetree(Number)
Number
  Complex
  Real
    AbstractFloat
      BigFloat
      Float16
      Float32
      Float64
    AbstractIrrational
      Irrational
    Integer
      Bool
      Signed
        BigInt
        Int128
        Int16
        Int32
        Int64
        Int8
      Unsigned
        UInt128
        UInt16
        UInt32
        UInt64
        UInt8
    Rational

We can have a nicer walk through the type tree by implementing an interface method AbstractTrees.children for types:

In [14]:
using Pkg
Pkg.add("AbstractTrees")
  Updating registry at `~/.julia/registries/General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`
 Resolving package versions...
  Updating `~/.julia/environments/v1.2/Project.toml`
 [no changes]
  Updating `~/.julia/environments/v1.2/Manifest.toml`
 [no changes]
In [15]:
using AbstractTrees
In [16]:
AbstractTrees.children(x::Type) = subtypes(x)
AbstractTrees.print_tree(Number)
Number
├─ Complex
└─ Real
   ├─ AbstractFloat
   │  ├─ BigFloat
   │  ├─ Float16
   │  ├─ Float32
   │  └─ Float64
   ├─ AbstractIrrational
   │  └─ Irrational
   ├─ Integer
   │  ├─ Bool
   │  ├─ Signed
   │  │  ├─ BigInt
   │  │  ├─ Int128
   │  │  ├─ Int16
   │  │  ├─ Int32
   │  │  ├─ Int64
   │  │  └─ Int8
   │  └─ Unsigned
   │     ├─ UInt128
   │     ├─ UInt16
   │     ├─ UInt32
   │     ├─ UInt64
   │     └─ UInt8
   └─ Rational

Abstract types are used to dispatch between methods as well

In [17]:
function test_dispatch(x::AbstractFloat)
    println("dispatch: $(typeof(x)) <:AbstractFloat, x=$(x)")
end
function test_dispatch(i::Integer)
    println("dispatch: $(typeof(i)) <:Integer, i=$(i)")
end
test_dispatch(one(Float16))
test_dispatch(10)
methods(test_dispatch)
dispatch: Float16 <:AbstractFloat, x=1.0
dispatch: Int64, i=10
Out[17]:
4 methods for generic function test_dispatch:
  • test_dispatch(i::Int64) in Main at In[5]:5
  • test_dispatch(x::Float64) in Main at In[5]:2
  • test_dispatch(x::AbstractFloat) in Main at In[17]:2
  • test_dispatch(i::Integer) in Main at In[17]:5

Now, depending on the input type for test_dispatch, a generic or a specific method is called

Testing of type relationships

In [18]:
@show  Float64<: AbstractFloat
@show  Float64<: Integer
@show  Int16<: AbstractFloat;
Float64 <: AbstractFloat = true
Float64 <: Integer = false
Int16 <: AbstractFloat = false

The power of multiple dispatch

  • Multiple dispatch is one of the defining features of Julia
  • Combined with the hierarchical type system it allows for powerful generic program design
  • New datatypes (different kinds of numbers, differently stored arrays/matrices) work with existing code once they implement the same interface as the existing ones (see the sketch below)
  • In some respects C++ comes close to this, but at the price of more code which is less obvious
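
As a sketch of this idea (a hypothetical toy example, not part of the lecture): a new number-like type becomes usable by existing generic code as soon as it implements the arithmetic interface that code relies on. Here a two-component "dual number" carrying a derivative is pushed through an ordinary polynomial function.

# Toy forward-mode "dual number": a value together with its derivative
struct Dual
    val::Float64
    der::Float64
end
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)

# A generic function written without any knowledge of Dual:
p(x) = x * x + x
@show p(2.0)              # ordinary Float64 evaluation: 6.0
@show p(Dual(2.0, 1.0))   # value p(2)=6 and derivative p'(2)=5 in one sweep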

Just-in-time compilation and Performance

  • Just-in-time compilation is another feature setting Julia apart
  • Use the tools from the LLVM Compiler Infrastructure project to organize on-the-fly compilation of Julia code to machine code
  • Tradeoff: startup time for code execution in interactive situations
  • Multiple steps: Parse the code, analyze data types etc.
  • Intermediate results can be inspected using a number of macros

From "Introduction to Writing High Performance Julia" by D. Robinson: https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxibG9uem9uaWNzfGd4OjMwZjI2YTYzNDNmY2UzMmE

Inspecting the code transformation

Define a function

In [19]:
g(x)=x+x
@show g(2)
methods(g)
g(2) = 4
Out[19]:
1 method for generic function g:
  • g(x) in Main at In[19]:1

Parse into abstract syntax tree

In [20]:
@code_lowered g(2)
println("-------------------------------------")
@code_lowered g(2.0)
-------------------------------------
Out[20]:
CodeInfo(
1 ─ %1 = x + x
└──      return %1
)

Type inference according to input

In [21]:
@code_warntype g(2)
println("-------------------------------------")
@code_warntype g(2.0)
Variables
  #self#::Core.Compiler.Const(g, false)
  x::Int64

Body::Int64
1 ─ %1 = (x + x)::Int64
└──      return %1
-------------------------------------
Variables
  #self#::Core.Compiler.Const(g, false)
  x::Float64

Body::Float64
1 ─ %1 = (x + x)::Float64
└──      return %1

LLVM Bytecode

In [22]:
@code_llvm g(2)
println("-------------------------------------")
@code_llvm g(2.0)
;  @ In[19]:1 within `g'
define i64 @julia_g_17084(i64) {
top:
; ┌ @ int.jl:53 within `+'
   %1 = shl i64 %0, 1
; └
  ret i64 %1
}
-------------------------------------

;  @ In[19]:1 within `g'
define double @julia_g_17223(double) {
top:
; ┌ @ float.jl:395 within `+'
   %1 = fadd double %0, %0
; └
  ret double %1
}

Native assembler code

In [23]:
@code_native g(2)
println("-------------------------------------")
@code_native g(2.0)
	.text
; ┌ @ In[19]:1 within `g'
; │┌ @ In[19]:1 within `+'
	leaq	(%rdi,%rdi), %rax
; │└
	retq
	nopw	%cs:(%rax,%rax)
; └
-------------------------------------
	.text
; ┌ @ In[19]:1 within `g'
; │┌ @ In[19]:1 within `+'
	vaddsd	%xmm0, %xmm0, %xmm0
; │└
	retq
	nopw	%cs:(%rax,%rax)
; └

Performance

Macros for performance testing:

  • @elapsed: wall clock time used
  • @allocated: amount of memory allocated (in bytes)
  • @time: @elapsed and @allocated together
  • @benchmark: benchmarking small pieces of code (provided by the BenchmarkTools package, used below)

Time twice in order to skip compilation time

In [24]:
function ftest(v::AbstractVector)
    result=0
    for i=1:length(v)
        result=result+v[i]^2
    end
    return result
end
@time ftest(ones(Float64,100000))
@time ftest(ones(Float64,100000))
  0.013210 seconds (33.90 k allocations: 2.623 MiB)
  0.000282 seconds (7 allocations: 781.500 KiB)
Out[24]:
100000.0

Run for a different type

In [25]:
@time ftest(ones(Int64,100000))
@time ftest(ones(Int64,100000))
  0.010208 seconds (24.03 k allocations: 2.071 MiB)
  0.000120 seconds (7 allocations: 781.500 KiB)
Out[25]:
100000

Julia performance gotchas:

  • Variables changing their type
    • In the global context (outside of a function), a type change is assumed to be possible at any time
    • Type changes caused by careless coding
  • Memory allocations for intermediate results (see the sketch below)
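
A small sketch of the last point (my own example, not from the lecture): elementwise expressions written naively allocate temporary arrays for intermediate results, while a fused broadcast into preallocated storage avoids this.

# Allocation of intermediate results vs. fused, in-place broadcasting
function with_temporaries(a, b)
    return 2 .* a + 3 .* b        # 2 .* a and 3 .* b each allocate a temporary vector
end
function fused!(result, a, b)
    result .= 2 .* a .+ 3 .* b    # one fused loop writing into preallocated storage
    return result
end
a = rand(1000); b = rand(1000); result = similar(a)
with_temporaries(a, b); fused!(result, a, b)   # warm up (compile) both
@time with_temporaries(a, b)                   # reports the temporary allocations
@time fused!(result, a, b)                     # should report (almost) no allocations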

Performance in global context

As an exception, for this example we use the CPUTime package, which works without macro expansion (CPUtic() and CPUtoc() are ordinary functions).

In [26]:
Pkg.add("CPUTime")
using CPUTime
 Resolving package versions...
  Updating `~/.julia/environments/v1.2/Project.toml`
 [no changes]
  Updating `~/.julia/environments/v1.2/Manifest.toml`
 [no changes]

Declare a long vector

In [27]:
myvec=ones(Float64,1000000);

Sum up its values

In [28]:
CPUtic()
begin
    x=0.0
    for i=1:length(myvec)
        global x
        x=x+myvec[i]
    end
    @show x
end
CPUtoc();
x = 1.0e6
elapsed CPU time: 0.251234 seconds

Alternatively, put the sum into a function

In [29]:
function mysum(v)
    x=0.0
    for i=1:length(v)
        x=x+v[i]
    end
    return x
end
Out[29]:
mysum (generic function with 1 method)

Run again

In [30]:
CPUtic()
begin
   @show mysum(myvec)
end
CPUtoc();
mysum(myvec) = 1.0e6
elapsed CPU time: 0.008832 seconds

What happened?

Julia Gotcha #1: The REPL (terminal) is the Global Scope.

  • So is the Jupyter notebook
  • Julia is unable to dispatch on variable types in the global scope, as they can change their type at any time
  • In the global context it has to put all variables into "boxes" so that it can dispatch on their type at runtime
  • Avoid this situation by always wrapping performance-critical code into functions

Type stability

Use @benchmark for testing small functions

In [31]:
Pkg.add("BenchmarkTools")
using BenchmarkTools
 Resolving package versions...
  Updating `~/.julia/environments/v1.2/Project.toml`
 [no changes]
  Updating `~/.julia/environments/v1.2/Manifest.toml`
 [no changes]
In [32]:
function g()
  x=1
  for i = 1:10
    x = x/2
  end
  return x
end
@benchmark g()
Out[32]:
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     8.839 ns (0.00% GC)
  median time:      9.942 ns (0.00% GC)
  mean time:        10.061 ns (0.00% GC)
  maximum time:     42.178 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     999
In [33]:
function h()
  x=1.0
  for i = 1:10
    x = x/2
  end
  return x
end
@benchmark h()
Out[33]:
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     1.207 ns (0.00% GC)
  median time:      1.213 ns (0.00% GC)
  mean time:        1.235 ns (0.00% GC)
  maximum time:     27.493 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

What happened?

Gotcha #2: Type instabilities

In [34]:
@code_native g()
	.text
; ┌ @ In[32]:2 within `g'
	pushq	%rax
	movb	$2, %dl
	movl	$1, %ecx
	movabsq	$139763379075000, %rax  # imm = 0x7F1D328FF7B8
	vmovsd	(%rax), %xmm0           # xmm0 = mem[0],zero
	movl	$9, %eax
	movabsq	$139763379075008, %rsi  # imm = 0x7F1D328FF7C0
	vmovsd	(%rsi), %xmm1           # xmm1 = mem[0],zero
	andb	$3, %dl
; │ @ In[32]:4 within `g'
	cmpb	$1, %dl
	jne	L83
	jmp	L93
	nopw	%cs:(%rax,%rax)
L64:
	vmovq	%xmm0, %rcx
	addq	$-1, %rax
	movb	$1, %dl
	andb	$3, %dl
	cmpb	$1, %dl
	je	L93
L83:
	cmpb	$2, %dl
	jne	L104
; │┌ @ int.jl:59 within `/'
; ││┌ @ float.jl:271 within `float'
; │││┌ @ float.jl:256 within `Type' @ float.jl:60
	vcvtsi2sdq	%rcx, %xmm2, %xmm0
; │└└└
; │┌ @ float.jl:401 within `/'
L93:
	vmulsd	%xmm1, %xmm0, %xmm0
; │└
; │┌ @ range.jl:595 within `iterate'
; ││┌ @ promotion.jl:403 within `=='
	testq	%rax, %rax
; │└└
	jne	L64
; │ @ In[32]:6 within `g'
	popq	%rax
	retq
; │ @ In[32]:4 within `g'
L104:
	movabsq	$jl_throw, %rax
	movabsq	$jl_system_image_data, %rdi
	callq	*%rax
	nop
; └
In [35]:
@code_native h()
	.text
; ┌ @ In[33]:2 within `h'
	movabsq	$139763379075120, %rax  # imm = 0x7F1D328FF830
; │ @ In[33]:6 within `h'
	vmovsd	(%rax), %xmm0           # xmm0 = mem[0],zero
	retq
	nop
; └

Once again, "boxing" occurs to handle x: in g() it changes its type from Int64 to Float64:

In [36]:
@code_warntype g()
Variables
  #self#::Core.Compiler.Const(g, false)
  x::Union{Float64, Int64}
  @_3::Union{Nothing, Tuple{Int64,Int64}}
  i::Int64

Body::Float64
1 ─       (x = 1)
 %2  = (1:10)::Core.Compiler.Const(1:10, false)
       (@_3 = Base.iterate(%2))
 %4  = (@_3::Core.Compiler.Const((1, 1), false) === nothing)::Core.Compiler.Const(false, false)
 %5  = Base.not_int(%4)::Core.Compiler.Const(true, false)
└──       goto #4 if not %5
2 ┄ %7  = @_3::Tuple{Int64,Int64}::Tuple{Int64,Int64}
       (i = Core.getfield(%7, 1))
 %9  = Core.getfield(%7, 2)::Int64
       (x = x / 2)
       (@_3 = Base.iterate(%2, %9))
 %12 = (@_3 === nothing)::Bool
 %13 = Base.not_int(%12)::Bool
└──       goto #4 if not %13
3 ─       goto #2
4 ┄       return x::Float64
In [37]:
@code_warntype h()
Variables
  #self#::Core.Compiler.Const(h, false)
  x::Float64
  @_3::Union{Nothing, Tuple{Int64,Int64}}
  i::Int64

Body::Float64
1 ─       (x = 1.0)
 %2  = (1:10)::Core.Compiler.Const(1:10, false)
       (@_3 = Base.iterate(%2))
 %4  = (@_3::Core.Compiler.Const((1, 1), false) === nothing)::Core.Compiler.Const(false, false)
 %5  = Base.not_int(%4)::Core.Compiler.Const(true, false)
└──       goto #4 if not %5
2 ┄ %7  = @_3::Tuple{Int64,Int64}::Tuple{Int64,Int64}
       (i = Core.getfield(%7, 1))
 %9  = Core.getfield(%7, 2)::Int64
       (x = x / 2)
       (@_3 = Base.iterate(%2, %9))
 %12 = (@_3 === nothing)::Bool
 %13 = Base.not_int(%12)::Bool
└──       goto #4 if not %13
3 ─       goto #2
4 ┄       return x

So, when in doubt, explicitly declare the types of variables
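
For example (a minimal sketch using the annotation syntax from the beginning of this notebook), fixing the accumulator type removes the instability in g():

function g_stable()
    x::Float64 = 1     # the integer literal is converted once; x stays Float64
    for i = 1:10
        x = x / 2
    end
    return x
end
@code_warntype g_stable()   # Body::Float64, no Union{Float64, Int64} left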

Structuring your code: modules, files and packages

  • Complex code is split up into several files
  • Avoid name clashes for code from different places
  • Organize the use of third-party code

Modules

  • Modules allow encapsulating implementations in separate namespaces
In [38]:
module TestModule
function mtest(x)
    println("mtest: x=$(x)")
end
export mtest
end
Out[38]:
Main.TestModule
  • Module content can be accessed via qualified names
In [39]:
TestModule.mtest(13)
mtest: x=13
  • "using" makes all exported content of a module available without prefixing
  • The '.' before the module name indicates a local module defined in the current session or file, as opposed to a package
In [40]:
using .TestModule
mtest(23)
mtest: x=23

Finding modules in the file system

  • Put single-file modules having the same name as the module into a directory which is on the LOAD_PATH
  • Call "using" or "import" with the module
  • You can modify your LOAD_PATH by adding e.g. the current directory
In [41]:
push!(LOAD_PATH, pwd())
Out[41]:
5-element Array{String,1}:
 "@"                                       
 "@v#.#"                                   
 "@stdlib"                                 
 "/home/fuhrmann/Wias/teach/scicomp/course"
 "/home/fuhrmann/Wias/teach/scicomp/course"

Do this e.g. in the startup file .julia/config/startup.jl
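
A sketch of such a startup file (the directory path is only a placeholder):

# ~/.julia/config/startup.jl -- executed at the start of every Julia session
push!(LOAD_PATH, "/home/user/my-julia-modules")  # hypothetical directory with single-file modules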

  • Create a module in the file system (normally, you would use your editor...). Note that """ ... """ delimits multiline strings, which may contain " characters
In [42]:
open("TestModule1.jl", "w") do io
    write(io, """
    module TestModule1
    function mtest1(x)
        println("mtest1: x=",x)
    end
    export mtest1
    end
    """)
end
Out[42]:
88
  • Import, enabling qualified access
In [43]:
using TestModule1
TestModule1.mtest1(23)
┌ Info: Recompiling stale cache file /home/fuhrmann/.julia/compiled/v1.2/TestModule1.ji for TestModule1 [top-level]
└ @ Base loading.jl:1240
mtest1: x=23
  • Import, enabling unqualified access to the exported names
In [44]:
using TestModule1
mtest1(23)
mtest1: x=23

Packages in the file system

  • Packages are found via the same mechanism
  • The load path includes the directory with downloaded packages and the directory with packages under development
  • Each package is a directory named Package with a subdirectory src
  • The file Package/src/Package.jl defines a module named Package
  • Additional structure in a package:
    • Documentation build recipes
    • Test code
    • Dependency description
    • UUID (universally unique identifier)
  • Default packages (e.g. the package manager Pkg) are always available
  • Use the package manager to check out a new package via the registry (a sketch of a minimal package layout follows below)
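
A minimal sketch of such a package (MyPackage and its contents are hypothetical; Pkg.generate("MyPackage") creates a skeleton with a fresh UUID):

# Layout of a hypothetical package:
#
# MyPackage/
# ├─ Project.toml        name, uuid, version, [deps]
# ├─ src/
# │  └─ MyPackage.jl     must define `module MyPackage ... end`
# └─ test/
#    └─ runtests.jl      run via Pkg.test("MyPackage")
#
# Contents of src/MyPackage.jl:
module MyPackage
greet() = println("Hello from MyPackage")
export greet
end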

Including code from files

  • The include statement allows just to include the code in a given file
In [45]:
open("myfile.jl", "w") do io
    write(io, """myfiletest(x)=println("myfiletest: x=",x)""")
end
Out[45]:
41
In [46]:
include("myfile.jl")
myfiletest(23)
myfiletest: x=23

How to return homework

  • For homework assignments, I want you to write single-file modules with a standard structure
In [47]:
module MyHomework
function main(;optional_parameter)
    println("Hello World")
end
end
Out[47]:
Main.MyHomework

This notebook was generated using Literate.jl.